Particle swarm optimization and function with several parameters - algorithm

I would like to optimize a function with several parameters using Particle swarm optimization. How can i do it? Everywhere I found this formula 1, but how can I understand this formula, I can optimize a function with only one variable. For example, I have a function with 2 parameters and I want to maximize it. How can I do it with PSO?
vi,d ← ω vi,d + φp rp (pi,d-xi,d) + φg rg (gd-xi,d)
function (x, y)
{
return x + y
}

Since you have just 2 variable to optimize, your search space would be two dimensional. Assume that you want to optimize parameters x1 and x2. Furthermore, x1 is in the range of [a1,b1] and x2 is in the range of [a2,b2]. First you need to initialize a random population of particles (say 30 particles) into the search space boundary and assign random values to velocity vectors (V). Afterwards, you need to evaluate the fitness of the all particles and determine the best particle (Global best). Then you should perform the main upading mechanism of PSO.
This link would be helpful:
http://yarpiz.com/50/ypea102-particle-swarm-optimization

Related

Best way to find all points of lattice in sphere

Given a bunch of arbitrary vectors (stored in a matrix A) and a radius r, I'd like to find all integer-valued linear combinations of those vectors which land inside a sphere of radius r. The necessary coordinates I would then store in a Matrix V. So, for instance, if the linear combination
K=[0; 1; 0]
lands inside my sphere, i.e. something like
if norm(A*K) <= r then
V(:,1)=K
end
etc.
The vectors in A are sure to be the simplest possible basis for the given lattice and the largest vector will have length 1. Not sure if that restricts the vectors in any useful way but I suspect it might. - They won't have as similar directions as a less ideal basis would have.
I tried a few approaches already but none of them seem particularly satisfying. I can't seem to find a nice pattern to traverse the lattice.
My current approach involves starting in the middle (i.e. with the linear combination of all 0s) and go through the necessary coordinates one by one. It involves storing a bunch of extra vectors to keep track of, so I can go through all the octants (in the 3D case) of the coordinates and find them one by one. This implementation seems awfully complex and not very flexible (in particular it doesn't seem to be easily generalizable to arbitrary numbers of dimension - although that isn't strictly necessary for the current purpose, it'd be a nice-to-have)
Is there a nice* way to find all the required points?
(*Ideally both efficient and elegant**. If REALLY necessary, it wouldn't matter THAT much to have a few extra points outside the sphere but preferably not that many more. I definitely do need all the vectors inside the sphere. - if it makes a large difference, I'm most interested in the 3D case.
**I'm pretty sure my current implementation is neither.)
Similar questions I found:
Find all points in sphere of radius r around arbitrary coordinate - this is actually a much more general case than what I'm looking for. I am only dealing with periodic lattices and my sphere is always centered at 0, coinciding with one point on the lattice.
But I don't have a list of points but rather a matrix of vectors with which I can generate all the points.
How to efficiently enumerate all points of sphere in n-dimensional grid - the case for a completely regular hypercubic lattice and the Manhattan-distance. I'm looking for completely arbitary lattices and euclidean distance (or, for efficiency purposes, obviously the square of that).
Offhand, without proving any assertions, I think that 1) if the set of vectors is not of maximal rank then the number of solutions is infinite; 2) if the set is of maximal rank, then the image of the linear transformation generated by the vectors is a subspace (e.g., plane) of the target space, which intersects the sphere in a lower-dimensional sphere; 3) it follows that you can reduce the problem to a 1-1 linear transformation (kxk matrix on a k-dimensional space); 4) since the matrix is invertible, you can "pull back" the sphere to an ellipsoid in the space containing the lattice points, and as a bonus you get a nice geometric description of the ellipsoid (principal axis theorem); 5) your problem now becomes exactly one of determining the lattice points inside the ellipsoid.
The latter problem is related to an old problem (counting the lattice points inside an ellipse) which was considered by Gauss, who derived a good approximation. Determining the lattice points inside an ellipse(oid) is probably not such a tidy problem, but it probably can be reduced one dimension at a time (the cross-section of an ellipsoid and a plane is another ellipsoid).
I found a method that makes me a lot happier for now. There may still be possible improvements, so if you have a better method, or find an error in this code, definitely please share. Though here is what I have for now: (all written in SciLab)
Step 1: Figure out the maximal ranges as defined by a bounding n-parallelotope aligned with the axes of the lattice vectors. Thanks for ElKamina's vague suggestion as well as this reply to another of my questions over on math.se by chappers: https://math.stackexchange.com/a/1230160/49989
function I=findMaxComponents(A,r) //given a matrix A of lattice basis vectors
//and a sphere radius r,
//find the corners of the bounding parallelotope
//built from the lattice, and store it in I.
[dims,vecs]=size(A); //figure out how many vectors there are in A (and, unnecessarily, how long they are)
U=eye(vecs,vecs); //builds matching unit matrix
iATA=pinv(A'*A); //finds the (pseudo-)inverse of A^T A
iAT=pinv(A'); //finds the (pseudo-)inverse of A^T
I=[]; //initializes I as an empty vector
for i=1:vecs //for each lattice vector,
t=r*(iATA*U(:,i))/norm(iAT*U(:,i)) //find the maximum component such that
//it fits in the bounding n-parallelotope
//of a (n-1)-sphere of radius r
I=[I,t(i)]; //and append it to I
end
I=[-I;I]; //also append the minima (by symmetry, the negative maxima)
endfunction
In my question I only asked for a general basis, i.e, for n dimensions, a set of n arbitrary but linearly independent vectors. The above code, by virtue of using the pseudo-inverse, works for matrices of arbitrary shapes and, similarly, Scilab's "A'" returns the conjugate transpose rather than just the transpose of A so it equally should work for complex matrices.
In the last step I put the corresponding minimal components.
For one such A as an example, this gives me the following in Scilab's console:
A =
0.9701425 - 0.2425356 0.
0.2425356 0.4850713 0.7276069
0.2425356 0.7276069 - 0.2425356
r=3;
I=findMaxComponents(A,r)
I =
- 2.9494438 - 3.4186986 - 4.0826424
2.9494438 3.4186986 4.0826424
I=int(I)
I =
- 2. - 3. - 4.
2. 3. 4.
The values found by findMaxComponents are the largest possible coefficients of each lattice vector such that a linear combination with that coefficient exists which still land on the sphere. Since I'm looking for the largest such combinations with integer coefficients, I can safely drop the part after the decimal point to get the maximal plausible integer ranges. So for the given matrix A, I'll have to go from -2 to 2 in the first component, from -3 to 3 in the second and from -4 to 4 in the third and I'm sure to land on all the points inside the sphere (plus superfluous extra points, but importantly definitely every valid point inside) Next up:
Step 2: using the above information, generate all the candidate combinations.
function K=findAllCombinations(I) //takes a matrix of the form produced by
//findMaxComponents() and returns a matrix
//which lists all the integer linear combinations
//in the respective ranges.
v=I(1,:); //starting from the minimal vector
K=[];
next=1; //keeps track of what component to advance next
changed=%F; //keeps track of whether to add the vector to the output
while or(v~=I(2,:)) //as long as not all components of v match all components of the maximum vector
if v <= I(2,:) then //if each current component is smaller than each largest possible component
if ~changed then
K=[K;v]; //store the vector and
end
v(next)=v(next)+1; //advance the component by 1
next=1; //also reset next to 1
changed=%F;
else
v(1:next)=I(1,1:next); //reset all components smaller than or equal to the current one and
next=next+1; //advance the next larger component next time
changed=%T;
end
end
K=[K;I(2,:)]'; //while loop ends a single iteration early so add the maximal vector too
//also transpose K to fit better with the other functions
endfunction
So now that I have that, all that remains is to check whether a given combination actually does lie inside or outside the sphere. All I gotta do for that is:
Step 3: Filter the combinations to find the actually valid lattice points
function points=generatePoints(A,K,r)
possiblePoints=A*K; //explicitly generates all the possible points
points=[];
for i=possiblePoints
if i'*i<=r*r then //filter those that are too far from the origin
points=[points i];
end
end
endfunction
And I get all the combinations that actually do fit inside the sphere of radius r.
For the above example, the output is rather long: Of originally 315 possible points for a sphere of radius 3 I get 163 remaining points.
The first 4 are: (each column is one)
- 0.2425356 0.2425356 1.2126781 - 0.9701425
- 2.4253563 - 2.6678919 - 2.4253563 - 2.4253563
1.6977494 0. 0.2425356 0.4850713
so the remainder of the work is optimization. Presumably some of those loops could be made faster and especially as the number of dimensions goes up, I have to generate an awful lot of points which I have to discard, so maybe there is a better way than taking the bounding n-parallelotope of the n-1-sphere as a starting point.
Let us just represent K as X.
The problem can be represented as:
(a11x1 + a12x2..)^2 + (a21x1 + a22x2..)^2 ... < r^2
(x1,x2,...) will not form a sphere.
This can be done with recursion on dimension--pick a lattice hyperplane direction and index all such hyperplanes that intersect the r-radius ball. The ball intersection of each such hyperplane itself is a ball, in one lower dimension. Repeat. Here's the calling function code in Octave:
function lat_points(lat_bas_mx,rr)
% **globals for hyperplane lattice point recursive function**
clear global; % this seems necessary/important between runs of this function
global MLB;
global NN_hat;
global NN_len;
global INP; % matrix of interior points, each point(vector) a column vector
global ctr; % integer counter, for keeping track of lattice point vectors added
% in the pre-allocated INP matrix; will finish iteration with actual # of points found
ctr = 0; % counts number of ball-interior lattice points found
MLB = lat_bas_mx;
ndim = size(MLB)(1);
% **create hyperplane normal vectors for recursion step**
% given full-rank lattice basis matrix MLB (each vector in lattice basis a column),
% form set of normal vectors between successive, nested lattice hyperplanes;
% store them as columnar unit normal vectors in NN_hat matrix and their lengths in NN_len vector
NN_hat = [];
for jj=1:ndim-1
tmp_mx = MLB(:,jj+1:ndim);
tmp_mx = [NN_hat(:,1:jj-1),tmp_mx];
NN_hat(:,jj) = null(tmp_mx'); % null space of transpose = orthogonal to columns
tmp_len = norm(NN_hat(:,jj));
NN_hat(:,jj) = NN_hat(:,jj)/tmp_len;
NN_len(jj) = dot(MLB(:,jj),NN_hat(:,jj));
if (NN_len(jj)<0) % NN_hat(:,jj) and MLB(:,jj) must have positive dot product
% for cutting hyperplane indexing to work correctly
NN_hat(:,jj) = -NN_hat(:,jj);
NN_len(jj) = -NN_len(jj);
endif
endfor
NN_len(ndim) = norm(MLB(:,ndim));
NN_hat(:,ndim) = MLB(:,ndim)/NN_len(ndim); % the lowest recursion level normal
% is just the last lattice basis vector
% **estimate number of interior lattice points, and pre-allocate memory for INP**
vol_ppl = prod(NN_len); % the volume of the ndim dimensional lattice paralellepiped
% is just the product of the NN_len's (they amount to the nested altitudes
% of hyperplane "paralellepipeds")
vol_bll = exp( (ndim/2)*log(pi) + ndim*log(rr) - gammaln(ndim/2+1) ); % volume of ndim ball, radius rr
est_num_pts = ceil(vol_bll/vol_ppl); % estimated number of lattice points in the ball
err_fac = 1.1; % error factor for memory pre-allocation--assume max of err_fac*est_num_pts columns required in INP
INP = zeros(ndim,ceil(err_fac*est_num_pts));
% **call the (recursive) function**
% for output, global variable INP (matrix of interior points)
% stores each valid lattice point (as a column vector)
clp = zeros(ndim,1); % confirmed lattice point (start at origin)
bpt = zeros(ndim,1); % point at center of ball (initially, at origin)
rd = 1; % initial recursion depth must always be 1
hyp_fun(clp,bpt,rr,ndim,rd);
printf("%i lattice points found\n",ctr);
INP = INP(:,1:ctr); % trim excess zeros from pre-allocation (if any)
endfunction
Regarding the NN_len(jj)*NN_hat(:,jj) vectors--they can be viewed as successive (nested) altitudes in the ndim-dimensional "parallelepiped" formed by the vectors in the lattice basis, MLB. The volume of the lattice basis parallelepiped is just prod(NN_len)--for a quick estimate of the number of interior lattice points, divide the volume of the ndim-ball of radius rr by prod(NN_len). Here's the recursive function code:
function hyp_fun(clp,bpt,rr,ndim,rd)
%{
clp = the lattice point we're entering this lattice hyperplane with
bpt = location of center of ball in this hyperplane
rr = radius of ball
rd = recrusion depth--from 1 to ndim
%}
global MLB;
global NN_hat;
global NN_len;
global INP;
global ctr;
% hyperplane intersection detection step
nml_hat = NN_hat(:,rd);
nh_comp = dot(clp-bpt,nml_hat);
ix_hi = floor((rr-nh_comp)/NN_len(rd));
ix_lo = ceil((-rr-nh_comp)/NN_len(rd));
if (ix_hi<ix_lo)
return % no hyperplane intersections detected w/ ball;
% get out of this recursion level
endif
hp_ix = [ix_lo:ix_hi]; % indices are created wrt the received reference point
hp_ln = length(hp_ix);
% loop through detected hyperplanes (updated)
if (rd<ndim)
bpt_new_mx = bpt*ones(1,hp_ln) + NN_len(rd)*nml_hat*hp_ix; % an ndim by length(hp_ix) matrix
clp_new_mx = clp*ones(1,hp_ln) + MLB(:,rd)*hp_ix; % an ndim by length(hp_ix) matrix
dd_vec = nh_comp + NN_len(rd)*hp_ix; % a length(hp_ix) row vector
rr_new_vec = sqrt(rr^2-dd_vec.^2);
for jj=1:hp_ln
hyp_fun(clp_new_mx(:,jj),bpt_new_mx(:,jj),rr_new_vec(jj),ndim,rd+1);
endfor
else % rd=ndim--so at deepest level of recursion; record the points on the given 1-dim
% "lattice line" that are inside the ball
INP(:,ctr+1:ctr+hp_ln) = clp + MLB(:,rd)*hp_ix;
ctr += hp_ln;
return
endif
endfunction
This has some Octave-y/Matlab-y things in it, but most should be easily understandable; M(:,jj) references column jj of matrix M; the tic ' means take transpose; [A B] concatenates matrices A and B; A=[] declares an empty matrix.
Updated / better optimized from original answer:
"vectorized" the code in the recursive function, to avoid most "for" loops (those slowed it down a factor of ~10; the code now is a bit more difficult to understand though)
pre-allocated memory for the INP matrix-of-interior points (this speeded it up by another order of magnitude; before that, Octave was having to resize the INP matrix for every call to the innermost recursion level--for large matrices/arrays that can really slow things down)
Because this routine was part of a project, I also coded it in Python. From informal testing, the Python version is another 2-3 times faster than this (Octave) version.
For reference, here is the old, much slower code in the original posting of this answer:
% (OLD slower code, using for loops, and constantly resizing
% the INP matrix) loop through detected hyperplanes
if (rd<ndim)
for jj=1:length(hp_ix)
bpt_new = bpt + hp_ix(jj)*NN_len(rd)*nml_hat;
clp_new = clp + hp_ix(jj)*MLB(:,rd);
dd = nh_comp + hp_ix(jj)*NN_len(rd);
rr_new = sqrt(rr^2-dd^2);
hyp_fun(clp_new,bpt_new,rr_new,ndim,rd+1);
endfor
else % rd=ndim--so at deepest level of recursion; record the points on the given 1-dim
% "lattice line" that are inside the ball
for jj=1:length(hp_ix)
clp_new = clp + hp_ix(jj)*MLB(:,rd);
INP = [INP clp_new];
endfor
return
endif

Algorithm to smooth a curve while keeping the area under it constant

Consider a discrete curve defined by the points (x1,y1), (x2,y2), (x3,y3), ... ,(xn,yn)
Define a constant SUM = y1+y2+y3+...+yn. Say we change the value of some k number of y points (increase or decrease) such that the total sum of these changed points is less than or equal to the constant SUM.
What would be the best possible manner to adjust the other y points given the following two conditions:
The total sum of the y points (y1'+y2'+...+yn') should remain constant ie, SUM.
The curve should retain as much of its original shape as possible.
A simple solution would be to define some delta as follows:
delta = (ym1' + ym2' + ym3' + ... + ymk') - (ym1 + ym2 + ym3 + ... + ymk')
and to distribute this delta over the rest of the points equally. Here ym1' is the value of the modified point after modification and ym1 is the value of the modified point before modification to give delta as the total difference in modification.
However this would not ensure a totally smoothed curve as area near changed points would appear ragged. Does a better solution/algorithm exist for the this problem?
I've used the following approach, though it is a bit OTT.
Consider adding d[i] to y[i], to get s[i], the smoothed value.
We seek to minimise
S = Sum{ 1<=i<N-1 | sqr( s[i+1]-2*s[i]+s[i-1] } + f*Sum{ 0<=i<N | sqr( d[i])}
The first term is a sum of the squares of (an approximate) second derivative of the curve, and the second term penalises moving away from the original. f is a (positive) constant. A little algebra recasts this as
S = sqr( ||A*d - b||)
where the matrix A has a nice structure, and indeed A'*A is penta-diagonal, which means that the normal equations (ie d = Inv(A'*A)*A'*b) can be solved efficiently. Note that d is computed directly, there is no need to initialise it.
Given the solution d to this problem we can compute the solution d^ to the same problem but with the constraint One'*d = 0 (where One is the vector of all ones) like this
d^ = d - (One'*d/Q) * e
e = Inv(A'*A)*One
Q = One'*e
What value to use for f? Well a simple approach is to try out this procedure on sample curves for various fs and pick a value that looks good. Another approach is to pick a estimate of smoothness, for example the rms of the second derivative, and then a value that should attain, and then search for an f that gives that value. As a general rule, the bigger f is the less smooth the smoothed curve will be.
Some motivation for all this. The aim is to find a 'smooth' curve 'close' to a given one. For this we need a measure of smoothness (the first term in S) and a measure of closeness (the second term. Why these measures? Well, each are easy to compute, and each are quadratic in the variables (the d[]); this will mean that the problem becomes an instance of linear least squares for which there are efficient algorithms available. Moreover each term in each sum depends on nearby values of the variables, which will in turn mean that the 'inverse covariance' (A'*A) will have a banded structure and so the least squares problem can be solved efficiently. Why introduce f? Well, if we didn't have f (or set it to 0) we could minimise S by setting d[i] = -y[i], getting a perfectly smooth curve s[] = 0, which has nothing to do with the y curve. On the other hand if f is gigantic, then to minimise s we should concentrate on the second term, and set d[i] = 0, and our 'smoothed' curve is just the original. So it's reasonable to suppose that as we vary f, the corresponding solutions will vary between being very smooth but far from y (small f) and being close to y but a bit rough (large f).
It's often said that the normal equations, whose use I advocate here, are a bad way to solve least squares problems, and this is generally true. However with 'nice' banded systems -- like the one here -- the loss of stability through using the normal equations is not so great, while the gain in speed is so great. I've used this approach to smooth curves with many thousands of points in a reasonable time.
To see what A is, consider the case where we had 4 points. Then our expression for S comes down to:
sqr( s[2] - 2*s[1] + s[0]) + sqr( s[3] - 2*s[2] + s[1]) + f*(d[0]*d[0] + .. + d[3]*d[3]).
If we substitute s[i] = y[i] + d[i] in this we get, for example,
s[2] - 2*s[1] + s[0] = d[2]-2*d[1]+d[0] + y[2]-2*y[1]+y[0]
and so we see that for this to be sqr( ||A*d-b||) we should take
A = ( 1 -2 1 0)
( 0 1 -2 1)
( f 0 0 0)
( 0 f 0 0)
( 0 0 f 0)
( 0 0 0 f)
and
b = ( -(y[2]-2*y[1]+y[0]))
( -(y[3]-2*y[2]+y[1]))
( 0 )
( 0 )
( 0 )
( 0 )
In an implementation, though, you probably wouldn't want to form A and b, as they are only going to be used to form the normal equation terms, A'*A and A'*b. It would be simpler to accumulate these directly.
This is a constrained optimization problem. The functional to be minimized is the integrated difference of the original curve and the modified curve. The constraints are the area under the curve and the new locations of the modified points. It is not easy to write such codes on your own. It is better to use some open source optimization codes, like this one: ool.
what about to keep the same dynamic range?
compute original min0,max0 y-values
smooth y-values
compute new min1,max1 y-values
linear interpolate all values to match original min max y
y=min1+(y-min1)*(max0-min0)/(max1-min1)
that is it
Not sure for the area but this should keep the shape much closer to original one. I got this Idea right now while reading your question and now I face similar problem so I try to code it and try right now anyway +1 for the getting me this Idea :)
You can adapt this and combine with the area
So before this compute the area and apply #1..#4 and after that compute new area. Then multiply all values by old_area/new_area ratio. If you have also negative values and not computing absolute area then you have to handle positive and negative areas separately and find multiplication ration to best fit original area for booth at once.
[edit1] some results for constant dynamic range
As you can see the shape is slightly shifting to the left. Each image is after applying few hundreds smooth operations. I am thinking of subdivision to local min max intervals to improve this ...
[edit2] have finished the filter for mine own purposes
void advanced_smooth(double *p,int n)
{
int i,j,i0,i1;
double a0,a1,b0,b1,dp,w0,w1;
double *p0,*p1,*w; int *q;
if (n<3) return;
p0=new double[n<<2]; if (p0==NULL) return;
p1=p0+n;
w =p1+n;
q =(int*)((double*)(w+n));
// compute original min,max
for (a0=p[0],i=0;i<n;i++) if (a0>p[i]) a0=p[i];
for (a1=p[0],i=0;i<n;i++) if (a1<p[i]) a1=p[i];
for (i=0;i<n;i++) p0[i]=p[i]; // store original values for range restoration
// compute local min max positions to p1[]
dp=0.01*(a1-a0); // min delta treshold
// compute first derivation
p1[0]=0.0; for (i=1;i<n;i++) p1[i]=p[i]-p[i-1];
for (i=1;i<n-1;i++) // eliminate glitches
if (p1[i]*p1[i-1]<0.0)
if (p1[i]*p1[i+1]<0.0)
if (fabs(p1[i])<=dp)
p1[i]=0.5*(p1[i-1]+p1[i+1]);
for (i0=1;i0;) // remove zeros from derivation
for (i0=0,i=0;i<n;i++)
if (fabs(p1[i])<dp)
{
if ((i> 0)&&(fabs(p1[i-1])>=dp)) { i0=1; p1[i]=p1[i-1]; }
else if ((i<n-1)&&(fabs(p1[i+1])>=dp)) { i0=1; p1[i]=p1[i+1]; }
}
// find local min,max to q[]
q[n-2]=0; q[n-1]=0; for (i=1;i<n-1;i++) if (p1[i]*p1[i-1]<0.0) q[i-1]=1; else q[i-1]=0;
for (i=0;i<n;i++) // set sign as +max,-min
if ((q[i])&&(p1[i]<-dp)) q[i]=-q[i]; // this shifts smooth curve to the left !!!
// compute weights
for (i0=0,i1=1;i1<n;i0=i1,i1++) // loop through all local min,max intervals
{
for (;(!q[i1])&&(i1<n-1);i1++); // <i0,i1>
b0=0.5*(p[i0]+p[i1]);
b1=fabs(p[i1]-p[i0]);
if (b1>=1e-6)
for (b1=0.35/b1,i=i0;i<=i1;i++) // compute weights bigger near local min max
w[i]=0.8+(fabs(p[i]-b0)*b1);
}
// smooth few times
for (j=0;j<5;j++)
{
for (i=0;i<n ;i++) p1[i]=p[i]; // store data to avoid shifting by using half filtered data
for (i=1;i<n-1;i++) // FIR smooth filter
{
w0=w[i];
w1=(1.0-w0)*0.5;
p[i]=(w1*p1[i-1])+(w0*p1[i])+(w1*p1[i+1]);
}
for (i=1;i<n-1;i++) // avoid local min,max shifting too much
{
if (q[i]>0) // local max
{
if (p[i]<p[i-1]) p[i]=p[i-1]; // can not be lower then neigbours
if (p[i]<p[i+1]) p[i]=p[i+1];
}
if (q[i]<0) // local min
{
if (p[i]>p[i-1]) p[i]=p[i-1]; // can not be higher then neigbours
if (p[i]>p[i+1]) p[i]=p[i+1];
}
}
}
for (i0=0,i1=1;i1<n;i0=i1,i1++) // loop through all local min,max intervals
{
for (;(!q[i1])&&(i1<n-1);i1++); // <i0,i1>
// restore original local min,max
a0=p0[i0]; b0=p[i0];
a1=p0[i1]; b1=p[i1];
if (a0>a1)
{
dp=a0; a0=a1; a1=dp;
dp=b0; b0=b1; b1=dp;
}
b1-=b0;
if (b1>=1e-6)
for (dp=(a1-a0)/b1,i=i0;i<=i1;i++)
p[i]=a0+((p[i]-b0)*dp);
}
delete[] p0;
}
so p[n] is the input/output data. There are few things that can be tweaked like:
weights computation (constants 0.8 and 0.35 means weights are <0.8,0.8+0.35/2>)
number of smooth passes (now 5 in the for loop)
the bigger the weight the less the filtering 1.0 means no change
The main Idea behind is:
find local extremes
compute weights for smoothing
so near local extremes are almost none change of the output
smooth
repair dynamic range per each interval between all local extremes
[Notes]
I did also try to restore the area but that is incompatible with mine task because it distorts the shape a lot. So if you really need the area then focus on that and not on the shape. The smoothing causes signal to shrink mostly so after area restoration the shape rise on magnitude.
Actual filter state has none markable side shifting of shape (which was the main goal for me). Some images for more bumpy signal (the original filter was extremly poor on this):
As you can see no visible signal shape shifting. The local extremes has tendency to create sharp spikes after very heavy smoothing but that was expected
Hope it helps ...

Finding translation and scale on two sets of points to get least square error in their distance?

I have two sets of 3D points (original and reconstructed) and correspondence information about pairs - which point from one set represents the second one. I need to find 3D translation and scaling factor which transforms reconstruct set so the sum of square distances would be least (rotation would be nice too, but points are rotated similarly, so this is not main priority and might be omitted in sake of simplicity and speed). And so my question is - is this solved and available somewhere on the Internet? Personally, I would use least square method, but I don't have much time (and although I'm somewhat good at math, I don't use it often, so it would be better for me to avoid it), so I would like to use other's solution if it exists. I prefer solution in C++, for example using OpenCV, but algorithm alone is good enough.
If there is no such solution, I will calculate it by myself, I don't want to bother you so much.
SOLUTION: (from your answers)
For me it's Kabsch alhorithm;
Base info: http://en.wikipedia.org/wiki/Kabsch_algorithm
General solution: http://nghiaho.com/?page_id=671
STILL NOT SOLVED:
I also need scale. Scale values from SVD are not understandable for me; when I need scale about 1-4 for all axises (estimated by me), SVD scale is about [2000, 200, 20], which is not helping at all.
Since you are already using Kabsch algorithm, just have a look at Umeyama's paper which extends it to get scale. All you need to do is to get the standard deviation of your points and calculate scale as:
(1/sigma^2)*trace(D*S)
where D is the diagonal matrix in SVD decomposition in the rotation estimation and S is either identity matrix or [1 1 -1] diagonal matrix, depending on the sign of determinant of UV (which Kabsch uses to correct reflections into proper rotations). So if you have [2000, 200, 20], multiply the last element by +-1 (depending on the sign of determinant of UV), sum them and divide by the standard deviation of your points to get scale.
You can recycle the following code, which is using the Eigen library:
typedef Eigen::Matrix<double, 3, 1, Eigen::DontAlign> Vector3d_U; // microsoft's 32-bit compiler can't put Eigen::Vector3d inside a std::vector. for other compilers or for 64-bit, feel free to replace this by Eigen::Vector3d
/**
* #brief rigidly aligns two sets of poses
*
* This calculates such a relative pose <tt>R, t</tt>, such that:
*
* #code
* _TyVector v_pose = R * r_vertices[i] + t;
* double f_error = (r_tar_vertices[i] - v_pose).squaredNorm();
* #endcode
*
* The sum of squared errors in <tt>f_error</tt> for each <tt>i</tt> is minimized.
*
* #param[in] r_vertices is a set of vertices to be aligned
* #param[in] r_tar_vertices is a set of vertices to align to
*
* #return Returns a relative pose that rigidly aligns the two given sets of poses.
*
* #note This requires the two sets of poses to have the corresponding vertices stored under the same index.
*/
static std::pair<Eigen::Matrix3d, Eigen::Vector3d> t_Align_Points(
const std::vector<Vector3d_U> &r_vertices, const std::vector<Vector3d_U> &r_tar_vertices)
{
_ASSERTE(r_tar_vertices.size() == r_vertices.size());
const size_t n = r_vertices.size();
Eigen::Vector3d v_center_tar3 = Eigen::Vector3d::Zero(), v_center3 = Eigen::Vector3d::Zero();
for(size_t i = 0; i < n; ++ i) {
v_center_tar3 += r_tar_vertices[i];
v_center3 += r_vertices[i];
}
v_center_tar3 /= double(n);
v_center3 /= double(n);
// calculate centers of positions, potentially extend to 3D
double f_sd2_tar = 0, f_sd2 = 0; // only one of those is really needed
Eigen::Matrix3d t_cov = Eigen::Matrix3d::Zero();
for(size_t i = 0; i < n; ++ i) {
Eigen::Vector3d v_vert_i_tar = r_tar_vertices[i] - v_center_tar3;
Eigen::Vector3d v_vert_i = r_vertices[i] - v_center3;
// get both vertices
f_sd2 += v_vert_i.squaredNorm();
f_sd2_tar += v_vert_i_tar.squaredNorm();
// accumulate squared standard deviation (only one of those is really needed)
t_cov.noalias() += v_vert_i * v_vert_i_tar.transpose();
// accumulate covariance
}
// calculate the covariance matrix
Eigen::JacobiSVD<Eigen::Matrix3d> svd(t_cov, Eigen::ComputeFullU | Eigen::ComputeFullV);
// calculate the SVD
Eigen::Matrix3d R = svd.matrixV() * svd.matrixU().transpose();
// compute the rotation
double f_det = R.determinant();
Eigen::Vector3d e(1, 1, (f_det < 0)? -1 : 1);
// calculate determinant of V*U^T to disambiguate rotation sign
if(f_det < 0)
R.noalias() = svd.matrixV() * e.asDiagonal() * svd.matrixU().transpose();
// recompute the rotation part if the determinant was negative
R = Eigen::Quaterniond(R).normalized().toRotationMatrix();
// renormalize the rotation (not needed but gives slightly more orthogonal transformations)
double f_scale = svd.singularValues().dot(e) / f_sd2_tar;
double f_inv_scale = svd.singularValues().dot(e) / f_sd2; // only one of those is needed
// calculate the scale
R *= f_inv_scale;
// apply scale
Eigen::Vector3d t = v_center_tar3 - (R * v_center3); // R needs to contain scale here, otherwise the translation is wrong
// want to align center with ground truth
return std::make_pair(R, t); // or put it in a single 4x4 matrix if you like
}
For 3D points the problem is known as the Absolute Orientation problem. A c++ implementation is available from Eigen http://eigen.tuxfamily.org/dox/group__Geometry__Module.html#gab3f5a82a24490b936f8694cf8fef8e60 and paper http://web.stanford.edu/class/cs273/refs/umeyama.pdf
you can use it via opencv by converting the matrices to eigen with cv::cv2eigen() calls.
Start with translation of both sets of points. So that their centroid coincides with the origin of the coordinate system. Translation vector is just the difference between these centroids.
Now we have two sets of coordinates represented as matrices P and Q. One set of points may be obtained from other one by applying some linear operator (which performs both scaling and rotation). This operator is represented by 3x3 matrix X:
P * X = Q
To find proper scale/rotation we just need to solve this matrix equation, find X, then decompose it into several matrices, each representing some scaling or rotation.
A simple (but probably not numerically stable) way to solve it is to multiply both parts of the equation to the transposed matrix P (to get rid of non-square matrices), then multiply both parts of the equation to the inverted PT * P:
PT * P * X = PT * Q
X = (PT * P)-1 * PT * Q
Applying Singular value decomposition to matrix X gives two rotation matrices and a matrix with scale factors:
X = U * S * V
Here S is a diagonal matrix with scale factors (one scale for each coordinate), U and V are rotation matrices, one properly rotates the points so that they may be scaled along the coordinate axes, other one rotates them once more to align their orientation to second set of points.
Example (2D points are used for simplicity):
P = 1 2 Q = 7.5391 4.3455
2 3 12.9796 5.8897
-2 1 -4.5847 5.3159
-1 -6 -15.9340 -15.5511
After solving the equation:
X = 3.3417 -1.2573
2.0987 2.8014
After SVD decomposition:
U = -0.7317 -0.6816
-0.6816 0.7317
S = 4 0
0 3
V = -0.9689 -0.2474
-0.2474 0.9689
Here SVD has properly reconstructed all manipulations I performed on matrix P to get matrix Q: rotate by the angle 0.75, scale X axis by 4, scale Y axis by 3, rotate by the angle -0.25.
If sets of points are scaled uniformly (scale factor is equal by each axis), this procedure may be significantly simplified.
Just use Kabsch algorithm to get translation/rotation values. Then perform these translation and rotation (centroids should coincide with the origin of the coordinate system). Then for each pair of points (and for each coordinate) estimate Linear regression. Linear regression coefficient is exactly the scale factor.
A good explanation Finding optimal rotation and translation between corresponding 3D points
The code is in matlab but it's trivial to convert to opengl using the cv::SVD function
You might want to try ICP (Iterative closest point).
Given two sets of 3d points, it will tell you the transformation (rotation + translation) to go from the first set to the second one.
If you're interested in a c++ lightweight implementation, try libicp.
Good luck!
The general transformation, as well the scale can be retrieved via Procrustes Analysis. It works by superimposing the objects on top of each other and tries to estimate the transformation from that setting. It has been used in the context of ICP, many times. In fact, your preference, Kabash algorithm is a special case of this.
Moreover, Horn's alignment algorithm (based on quaternions) also finds a very good solution, while being quite efficient. A Matlab implementation is also available.
Scale can be inferred without SVD, if your points are uniformly scaled in all directions (I could not make sense of SVD-s scale matrix either). Here is how I solved the same problem:
Measure distances of each point to other points in the point cloud to get a 2d table of distances, where entry at (i,j) is norm(point_i-point_j). Do the same thing for the other point cloud, so you get two tables -- one for original and the other for reconstructed points.
Divide all values in one table by the corresponding values in the other table. Because the points correspond to each other, the distances do too. Ideally, the resulting table has all values being equal to each other, and this is the scale.
The median value of the divisions should be pretty close to the scale you are looking for. The mean value is also close, but I chose median just to exclude outliers.
Now you can use the scale value to scale all the reconstructed points and then proceed to estimating the rotation.
Tip: If there are too many points in the point clouds to find distances between all of them, then a smaller subset of distances will work, too, as long as it is the same subset for both point clouds. Ideally, just one distance pair would work if there is no measurement noise, e.g when one point cloud is directly derived from the other by just rotating it.
you can also use ScaleRatio ICP proposed by BaoweiLin
The code can be found in github

support vector machines - a simple explanation?

So, i'm trying to understand how the SVM algorithm works but i just cannot figure out how you transform some datasets in points of n-dimensional plane that would have a mathematical meaning in order to separate the points through a hyperplane and clasify them.
There's an example here, they are trying to clasify pictures of tigers and elephants, they say "We digitize them into 100x100 pixel images, so we have x in n-dimensional plane, where n=10,000", but my question is how do they transform the matrices that actually represent just some color codes IN points that have a methematical meaning in order to clasify them in 2 categories?
Probably someone can explain me this in a 2D example because any graphical representation i see it's just 2D, not nD.
The short answer is: they don't transform the matrices, but treat each element in the matrix as a dimension (in machine learning each element would be called a Feature).
Thus, they need classify elements with 100x100 = 10000 features each. In the linear SVM case, they do so using a hyperplane, which divides the 10,000-dimensional space into two distinct regions.
A longer answer would be:
Consider your 2D case. Now, you want to separate a set of two-dimensional elements. This means that each element in your set can be described mathematically as a 2-tuple, namely: e = (x1, x2). For example, in your figure, some full dots might be: {(1,3), (2,4)}, and some hollow ones might be {(4,2), (5,1)}. Note that in order to classify them with a linear classifier, you need a 2-dimensional linear classifier, which would yield a decision rule which might look like this:
e = (x1, x2)
if (w1 * x1 + w2 * x2) > C : decide that e is a full dot.
otherwise : e is hollow.
Note that the classifier is linear, as it is a linear combination of the elements of e. The 'w's are called 'weights', and 'C' is the decision threshold. a linear function with 2-elements as above is simply a line, that's why in your figures the H's are lines.
Now, back to our n-dimensional case, you can probably figure our that a line will not do the trick. In the 3D case, we would need a plane: (w1 * x1 + w2 * x2 + w2 * x3) > C, and in the n-dimensional case, we would need a hyperplane: (w1 * x1 + w2 * x2 + ... + wn * xn) > C, which is damn hard to imagine, none the less to draw :-).

Fastest way to fit a parabola to set of points?

Given a set of points, what's the fastest way to fit a parabola to them? Is it doing the least squares calculation or is there an iterative way?
Thanks
Edit:
I think gradient descent is the way to go. The least squares calculation would have been a little bit more taxing (having to do qr decomposition or something to keep things stable).
If the points have no error associated, you may interpolate by three points. Otherwise least squares or any equivalent formulation is the way to go.
I recently needed to find a parabola that passes through 3 points.
suppose you have (x1,y1), (x2,y2) and (x3,y3) and you want the parabola
y-y0 = a*(x-x0)^2
to pass through them: find y0, x0, and a.
You can do some algebra and get this solution (providing the points aren't all on a line) :
let c = (y1-y2) / (y2-y3)
x0 = ( -x1^2 + x2^2 + c*( x2^2 - x3^2 ) ) / (2.0*( -x1+x2 + c*x2 - c*x3 ))
a = (y1-y2) / ( (x1-x0)^2 - (x2-x0)^2 )
y0 = y1 - a*(x1-x0)^2
Note in the equation for c if y2==y3 then you've got a problem. So in my algorithm I check for this and swap say x1, y1 with x2, y2 and then proceed.
hope that helps!
Paul Probert
A calculated solution is almost always faster than an iterative solution. The "exception" would be for low iteration counts and complex calculations.
I would use the least squares method. I've only every coded it for linear regression fits but it can be used for parabolas (I had reason to look it up recently - sources included an old edition of "Numerical Recipes" Press et al; and "Engineering Mathematics" Kreyzig).
ALGORITHM FOR PARABOLA
Read no. of data points n and order of polynomial Mp .
Read data values .
If n< Mp
[ Regression is not possible ]
stop
else
continue ;
Set M=Mp + 1 ;
Compute co-efficient of C-matrix .
Compute co-efficient of B-matrix .
Solve for the co-efficients
a1,a2,. . . . . . . an .
Write the co-efficient .
Estimate the function value at the glren of independents variables .
Using the free arbitrary accuracy math program "PARI" (for Mac or PC):
Here is how I would fit a parabola to a set of 641 points,
and I also show how to find the minimum of that parabola:
Set a high number of digits of precision:
\p 300
Write the data points to a text file separated by one space
for each data point
(use ASCII characters in base ten, no space at file start or file end, and no returns, write extremely large or small floating points as for example
"9.0E-23" but not "9.0D-23" ).
make a string to point to that file:
fileone="./desktop/data.txt"
read that file into PARI using the following instructions:
fileopen(fileone,r)
readsplit(file) = my(cmd);cmd="perl -ne \"chomp; print '[' . join(',', split(/ +/)) . ']\n';\"";eval(externstr(Str(cmd," ",file)))
readsplit(fileone)
Label that data with a name:
in = %
V = in[1]
Define a least squares fit function:
lsf(X,Y,n) = my(M=matrix(#X,n+1,i,j,X[i]^(j-1)));fit=Polrev(matsolve(M~*M,M~*Y~))
Apply that lsf function to your 641 data points:
lsf([-320..320],V, 2)
Then if you want to show the minimum of that parabolic fit, enter:
xextreme = solve (x=-1000,1000,eval(deriv(fit)));print (xextreme*(124.5678-123.5678)/640+(124.5678+123.5678)/2);x=xextreme;print(eval(fit))
(I had to adjust for my particular x-axis scaling before the "print" statement in that command line above).
(Note: A sacrifice made to simplify this algorithm
causes it to work only
when the data set has equally spaced x-axis coordinates.)
I was worried that my last post
was too compact to follow and
too hard to convert to other environments.
I would like to show here how to solve the
generalized problem of parabolic data fitting explicitly
without specialized matrix math terminology;
and so that each multiplication, division,
subtraction and addition can be seen at once.
To save ink this fit reparameterizes the x-axis as evenly
spaced points centered on zero
so that odd powered sums all get eliminated
(saving a lot of space and time),
so the x-coordinates of the N data points
are effectively labeled by points
of this vector: X=[-(N-1)/2..(N-1)/2].
For example "xextreme" will be returned
versus those integer indices
and so (if desired) a simple (consumes very little CPU time)
linear transformation must be applied after the algorithm below
to get it versus your problem's particular x-axis labels.
This is written in the language of
the free program "PARI" but all the
commands are simple to translate to any language.
Step 1: assign a label to the y-axis data:
? V=[5,2,1,2,5]
"PARI" confirms that entry:
%280 = [5, 2, 1, 2, 5]
Then type in the following processing algorithm
which calculates a best fit parabola
through any y-axis data set with constant x-axis separation:
? g=#V;h=(g-1)*g*(g+1)/3;i=h*(3*g*g-7)/5;\
a=sum(i=1,g,V[i]);b=sum(i=1,g,(2*i-1-g)*V[i]);c=sum(i=1,g,(2*i-1-g)*(2*i-1-g)*V[i]);\
A=matdet([a,c;h,i])/matdet([g,h;h,i]);B=b/h*2;C=matdet([g,h;a,c])/matdet([g,h;h,i])*4;\
xextreme=-B/(2*C);yextreme=-B*B/(4*C)+A;fit=Polrev([A,B,C]);\
print("\n","y of extreme is ",yextreme,"\n","which occurs this many data points from center of data: ",xextreme)
(Note for non-PARI users:
the command "matdet([a,c;h,i])"
is just another way of entering "a*i-c*h")
Those commands then produce the following screen output:
y of extreme is 1
which occurs this many data points from center of data: 0
The algorithm stores the polynomial of the fit in the variable "fit":
? fit
%282 = x^2 + 1
?
(Note that to make that algorithm short
the x-axis labels are assigned as X=[-(N-1)/2..(N-1)/2],
thus they are X=[-2,-1,0,1,2]
To correct that
for the same polynomial as parameterized
by an x-axis coordinate data set of say X=[−1,0,1,2,3]:
just apply a simple linear transform, in this case:
"x^2 + 1" --> "(t - 1)^2 + 1".)

Resources