Compressing GPS Points - algorithm

I have a device that records GPS data. A reading is taken every 2-10 seconds. For an activity taking 2 hours there are a lot of GPS points.
Does anyone know of an algorithm for compressing the dataset by removing redundant data points. i.e. If a series of data points are all in a straight line then only the start and end point are required.

check out the Douglas Peucker Algorithm which is used to simplify a polygon. i´ve used this successfully to reduce the amount of gps waypoints when trasmitted to clients for displaying purposes.

You probably want to approximate your path x(t), y(t) with a polynomial approximation of it. Are you looking for something like this: http://www.youtube.com/watch?v=YtcZXlKbDJY ???

You can remove redundant points by performing a very basic simplification based on calculation of slope between subsequent points.
Here is a bit of but not complete C++ code presenting possible algorithm:
struct Point
{
double x;
double y;
};
double calculate_slope(Point const& p1, Point const& p2)
{
// dy y2 - y1
// m = ---- = ---------
// dx x2 - x1
return ((p2.y - p1.y) / (p2.x - p1.x));
}
// 1. Read first two points from GPS stream source
Point p0 = ... ;
Point p1 = ... ;
// 2. Accept p0 as it's first point
// 3. Calculate slope
double m0 = calculate_slope(p0, p1);
// 4. next point is now previous
p0 = p1;
// 5. Read another point
p1 = ... ;
double m1 = calculate_slope(p0, p1);
// 6. Eliminate Compare slopes
double const tolerance = 0.1; // choose your tolerance
double const diff = m0 - m1;
bool if (!((diff <= tolerance) && (diff >= -tolerance)))
{
// 7. Accept p0 and eliminate p1
m0 = m1;
}
// Repeat steps from 4 to 7 for the rest of points.
I hope it helps.

There is a research paper on Compressing GPS Data on Mobile Devices.
Additionally, you can look at this CodeProject article on Writing GPS Applications. I think the problem you will have is not for straight points, but curved roads. It all depends on how precise you want your path to be.

The code given above has a couple of issues that might make it unsuitable:
"same slope" tolerance measures difference in gradient rather than angle, so NNE to NNW is considered a much bigger difference than NE to SE (assuming y axis runs North-South).
One way of addressing this would be for the tolerance to measure how the dot product of two segments compares with the product of their lengths. (It might help understanding to remember that dot product of two vectors is the product of their lengths and the cosine of the angle between them.) However, see next point.
Considers only slope error rather than position error, so a long ENE segment followed by long ESE segment is just as likely to be compressed to a single segment as a string of short segments alternating between ENE and ESE.
The approach that I was thinking of would be to look at what vector graphics applications do to convert a list of cursor coordinates into a sequence of curves. E.g. see lib2geom's bezier-utils.cpp. Note that (i) it's almost entirely position-based rather than direction-based; and (ii) it gives cubic bézier curves as output rather than a polyline, though you could use the same approach to give polyline output if that's preferred (in which case the Newton-Raphson step becomes essentially just a simple dot product).

Related

How to find the order of discrete point-set efficiently?

I have a series of discrete point on a plane, However, their order is scattered. Here is an instance:
To connect them with a smooth curve, I wrote a findSmoothBoundary() to achieve the smooth boundary.
Code
function findSmoothBoundary(boundaryPointSet)
%initialize the current point
currentP = boundaryPointSet(1,:);
%Create a space smoothPointsSet to store the point
smoothPointsSet = NaN*ones(length(boundaryPointSet),2);
%delete the current point from the boundaryPointSet
boundaryPointSet(1,:) = [];
ptsNum = 1; %record the number of smoothPointsSet
smoothPointsSet(ptsNum,:) = currentP;
while ~isempty(boundaryPointSet)
%ultilize the built-in knnsearch() to
%achieve the nearest point of current point
nearestPidx = knnsearch(boundaryPointSet,currentP);
currentP = boundaryPointSet(nearestPidx,:);
ptsNum = ptsNum + 1;
smoothPointsSet(ptsNum,:) = currentP;
%delete the nearest point from boundaryPointSet
boundaryPointSet(nearestPidx,:) = [];
end
%visualize the smooth boundary
plot(smoothPointsSet(:,1),smoothPointsSet(:,2))
axis equal
end
Although findSmoothBoundary() can find the smooth boundary rightly, but its efficiency is much lower ( About the data, please see here)
So I would like to know:
How to find the discrete point order effieciently?
Data
theta = linspace(0,2*pi,1000)';
boundaryPointSet= [2*sin(theta),cos(theta)];
tic;
findSmoothBoundary(boundaryPointSet)
toc;
%Elapsed time is 4.570719 seconds.
This answer is not perfect because I'll have to make a few hypothesis in order for it to work. However, for a vast majority of cases, it should works as intended. Moreover, from the link you gave in the comments, I think these hypothesis are at least weak, if not verified by definition :
1. The point form a single connected region
2. The center of mass of your points lies in the convex hull of those points
If these hypothesis are respected, you can do the following (Full code available at the end):
Step 1 : Calculate the center of mass of your points
Means=mean(boundaryPointSet);
Step 2 : Change variables to set the origin to the center of mass
boundaryPointSet(:,1)=boundaryPointSet(:,1)-Means(1);
boundaryPointSet(:,2)=boundaryPointSet(:,2)-Means(2);
Step3 : Convert coordinates to polar
[Angles,Radius]=cart2pol(boundaryPointSet(:,1),boundaryPointSet(:,2));
Step4 : Sort the Angle and use this sorting to sort the Radius
[newAngles,ids]=sort(Angles);
newRadius=Radius(ids);
Step5 : Go back to cartesian coordinates and re-add the coordinates of the center of mass:
[X,Y]=pol2cart(newAngles,newRadius);
X=X+Means(1);
Y=Y+means(2);
Full Code
%%% Find smooth boundary
fid=fopen('SmoothBoundary.txt');
A=textscan(fid,'%f %f','delimiter',',');
boundaryPointSet=cell2mat(A);
boundaryPointSet(any(isnan(boundaryPointSet),2),:)=[];
idx=randperm(size(boundaryPointSet,1));
boundaryPointSet=boundaryPointSet(idx,:);
tic
plot(boundaryPointSet(:,1),boundaryPointSet(:,2))
%% Find mean value of all parameters
Means=mean(boundaryPointSet);
%% Center values around Mean point
boundaryPointSet(:,1)=boundaryPointSet(:,1)-Means(1);
boundaryPointSet(:,2)=boundaryPointSet(:,2)-Means(2);
%% Get polar coordinates of your points
[Angles,Radius]=cart2pol(boundaryPointSet(:,1),boundaryPointSet(:,2));
[newAngles,ids]=sort(Angles);
newRadius=Radius(ids);
[X,Y]=pol2cart(newAngles,newRadius);
X=X+Means(1);
Y=Y+means(2);
toc
figure
plot(X,Y);
Note : As your values are already sorted in your input file, I had to mess it up a bit by permutating them
Outputs :
Boundary
Elapsed time is 0.131808 seconds.
Messed Input :
Output :

Collision of circular objects

I'm going to develop carom board game. I'm having the problem with the collision of two pieces. How to find the collision point of two pieces. And then how to find the angle and distance the pieces travel after collision.I found the solution of the collision point at circle-circle collision. here the solution is described with trigonometry, but I want the solution with vector math. With which the problem of the distance covered after collision will also be solve easily.
You do not need to find the collision point for the collision computation itself. To detect a collision event, you only need to compare the distance of the centers go the sum of radii
sqrt(sqr(x2-x1)+sqr(y2-y1))<=r1+r2
or
dx*dx+dy*dy <= (r1+r2)*(r1+r2)
where (x1,y1) and (x2,y2) are the positions of disks 1 (with mass m1, radius r1 and velocity (vx1,vy1)) and 2. Differences are always 2 minus 1, dx=x2-x1 etc.
You will almost never get that the collision happens at the sample time of the time discretisation. With the above formula, the circles already have overlap. Depending on time step and general velocities this may be negligible or can result in severe over-shooting. The following simple computation assumes slow motion, i.e., very small overlap during the last time step.
The only thing that happens in the fully elastic collision of non-rotating disks is a change in velocity. This change has to happen only once, when further movement would further reduce the distance, i.e., if
dx*dvx+dy*dvy < 0
where dvx=vx2-vx1 etc.
Using these ideas, the computation looks like this (see https://stackoverflow.com/a/23193044/3088138 for details)
dx = x2-x1; dy = y2-y1;
dist2 = dx*dx + dy*dy;
R = r1+r2;
if ( dist2 <= R*R )
{
dvx=vx2-vx1; dvy=vy2-vy1;
dot = dx*dvx + dy*dvy;
if ( dot < 0 )
{
factor = 2/(m1+m2)*dot/dist2;
vx1 += m2*factor*dx;
vy1 += m2*factor*dy;
vx2 -= m1*factor*dx;
vy2 -= m1*factor*dy;
}
}

Multiliteration implementation with inaccurate distance data

I am trying to create an android smartphone application which uses Apples iBeacon technology to determine the current indoor location of itself. I already managed to get all available beacons and calculate the distance to them via the rssi signal.
Currently I face the problem, that I am not able to find any library or implementation of an algorithm, which calculates the estimated location in 2D by using 3 (or more) distances of fixed points with the condition, that these distances are not accurate (which means, that the three "trilateration-circles" do not intersect in one point).
I would be deeply grateful if anybody can post me a link or an implementation of that in any common programming language (Java, C++, Python, PHP, Javascript or whatever). I already read a lot on stackoverflow about that topic, but could not find any answer I were able to convert in code (only some mathematical approaches with matrices and inverting them, calculating with vectors or stuff like that).
EDIT
I thought about an own approach, which works quite well for me, but is not that efficient and scientific. I iterate over every meter (or like in my example 0.1 meter) of the location grid and calculate the possibility of that location to be the actual position of the handset by comparing the distance of that location to all beacons and the distance I calculate with the received rssi signal.
Code example:
public Location trilaterate(ArrayList<Beacon> beacons, double maxX, double maxY)
{
for (double x = 0; x <= maxX; x += .1)
{
for (double y = 0; y <= maxY; y += .1)
{
double currentLocationProbability = 0;
for (Beacon beacon : beacons)
{
// distance difference between calculated distance to beacon transmitter
// (rssi-calculated distance) and current location:
// |sqrt(dX^2 + dY^2) - distanceToTransmitter|
double distanceDifference = Math
.abs(Math.sqrt(Math.pow(beacon.getLocation().x - x, 2)
+ Math.pow(beacon.getLocation().y - y, 2))
- beacon.getCurrentDistanceToTransmitter());
// weight the distance difference with the beacon calculated rssi-distance. The
// smaller the calculated rssi-distance is, the more the distance difference
// will be weighted (it is assumed, that nearer beacons measure the distance
// more accurate)
distanceDifference /= Math.pow(beacon.getCurrentDistanceToTransmitter(), 0.9);
// sum up all weighted distance differences for every beacon in
// "currentLocationProbability"
currentLocationProbability += distanceDifference;
}
addToLocationMap(currentLocationProbability, x, y);
// the previous line is my approach, I create a Set of Locations with the 5 most probable locations in it to estimate the accuracy of the measurement afterwards. If that is not necessary, a simple variable assignment for the most probable location would do the job also
}
}
Location bestLocation = getLocationSet().first().location;
bestLocation.accuracy = calculateLocationAccuracy();
Log.w("TRILATERATION", "Location " + bestLocation + " best with accuracy "
+ bestLocation.accuracy);
return bestLocation;
}
Of course, the downside of that is, that I have on a 300m² floor 30.000 locations I had to iterate over and measure the distance to every single beacon I got a signal from (if that would be 5, I do 150.000 calculations only for determine a single location). That's a lot - so I will let the question open and hope for some further solutions or a good improvement of this existing solution in order to make it more efficient.
Of course it has not to be a Trilateration approach, like the original title of this question was, it is also good to have an algorithm which includes more than three beacons for the location determination (Multilateration).
If the current approach is fine except for being too slow, then you could speed it up by recursively subdividing the plane. This works sort of like finding nearest neighbors in a kd-tree. Suppose that we are given an axis-aligned box and wish to find the approximate best solution in the box. If the box is small enough, then return the center.
Otherwise, divide the box in half, either by x or by y depending on which side is longer. For both halves, compute a bound on the solution quality as follows. Since the objective function is additive, sum lower bounds for each beacon. The lower bound for a beacon is the distance of the circle to the box, times the scaling factor. Recursively find the best solution in the child with the lower lower bound. Examine the other child only if the best solution in the first child is worse than the other child's lower bound.
Most of the implementation work here is the box-to-circle distance computation. Since the box is axis-aligned, we can use interval arithmetic to determine the precise range of distances from box points to the circle center.
P.S.: Math.hypot is a nice function for computing 2D Euclidean distances.
Instead of taking confidence levels of individual beacons into account, I would instead try to assign an overall confidence level for your result after you make the best guess you can with the available data. I don't think the only available metric (perceived power) is a good indication of accuracy. With poor geometry or a misbehaving beacon, you could be trusting poor data highly. It might make better sense to come up with an overall confidence level based on how well the perceived distance to the beacons line up with the calculated point assuming you trust all beacons equally.
I wrote some Python below that comes up with a best guess based on the provided data in the 3-beacon case by calculating the two points of intersection of circles for the first two beacons and then choosing the point that best matches the third. It's meant to get started on the problem and is not a final solution. If beacons don't intersect, it slightly increases the radius of each up until they do meet or a threshold is met. Likewise, it makes sure the third beacon agrees within a settable threshold. For n-beacons, I would pick 3 or 4 of the strongest signals and use those. There are tons of optimizations that could be done and I think this is a trial-by-fire problem due to the unwieldy nature of beaconing.
import math
beacons = [[0.0,0.0,7.0],[0.0,10.0,7.0],[10.0,5.0,16.0]] # x, y, radius
def point_dist(x1,y1,x2,y2):
x = x2-x1
y = y2-y1
return math.sqrt((x*x)+(y*y))
# determines two points of intersection for two circles [x,y,radius]
# returns None if the circles do not intersect
def circle_intersection(beacon1,beacon2):
r1 = beacon1[2]
r2 = beacon2[2]
dist = point_dist(beacon1[0],beacon1[1],beacon2[0],beacon2[1])
heron_root = (dist+r1+r2)*(-dist+r1+r2)*(dist-r1+r2)*(dist+r1-r2)
if ( heron_root > 0 ):
heron = 0.25*math.sqrt(heron_root)
xbase = (0.5)*(beacon1[0]+beacon2[0]) + (0.5)*(beacon2[0]-beacon1[0])*(r1*r1-r2*r2)/(dist*dist)
xdiff = 2*(beacon2[1]-beacon1[1])*heron/(dist*dist)
ybase = (0.5)*(beacon1[1]+beacon2[1]) + (0.5)*(beacon2[1]-beacon1[1])*(r1*r1-r2*r2)/(dist*dist)
ydiff = 2*(beacon2[0]-beacon1[0])*heron/(dist*dist)
return (xbase+xdiff,ybase-ydiff),(xbase-xdiff,ybase+ydiff)
else:
# no intersection, need to pseudo-increase beacon power and try again
return None
# find the two points of intersection between beacon0 and beacon1
# will use beacon2 to determine the better of the two points
failing = True
power_increases = 0
while failing and power_increases < 10:
res = circle_intersection(beacons[0],beacons[1])
if ( res ):
intersection = res
else:
beacons[0][2] *= 1.001
beacons[1][2] *= 1.001
power_increases += 1
continue
failing = False
# make sure the best fit is within x% (10% of the total distance from the 3rd beacon in this case)
# otherwise the results are too far off
THRESHOLD = 0.1
if failing:
print 'Bad Beacon Data (Beacon0 & Beacon1 don\'t intersection after many "power increases")'
else:
# finding best point between beacon1 and beacon2
dist1 = point_dist(beacons[2][0],beacons[2][1],intersection[0][0],intersection[0][1])
dist2 = point_dist(beacons[2][0],beacons[2][1],intersection[1][0],intersection[1][1])
if ( math.fabs(dist1-beacons[2][2]) < math.fabs(dist2-beacons[2][2]) ):
best_point = intersection[0]
best_dist = dist1
else:
best_point = intersection[1]
best_dist = dist2
best_dist_diff = math.fabs(best_dist-beacons[2][2])
if best_dist_diff < THRESHOLD*best_dist:
print best_point
else:
print 'Bad Beacon Data (Beacon2 distance to best point not within threshold)'
If you want to trust closer beacons more, you may want to calculate the intersection points between the two closest beacons and then use the farther beacon to tie-break. Keep in mind that almost anything you do with "confidence levels" for the individual measurements will be a hack at best. Since you will always be working with very bad data, you will defintiely need to loosen up the power_increases limit and threshold percentage.
You have 3 points : A(xA,yA,zA), B(xB,yB,zB) and C(xC,yC,zC), which respectively are approximately at dA, dB and dC from you goal point G(xG,yG,zG).
Let's say cA, cB and cC are the confidence rate ( 0 < cX <= 1 ) of each point.
Basically, you might take something really close to 1, like {0.95,0.97,0.99}.
If you don't know, try different coefficient depending of distance avg. If distance is really big, you're likely to be not very confident about it.
Here is the way i'll do it :
var sum = (cA*dA) + (cB*dB) + (cC*dC);
dA = cA*dA/sum;
dB = cB*dB/sum;
dC = cC*dC/sum;
xG = (xA*dA) + (xB*dB) + (xC*dC);
yG = (yA*dA) + (yB*dB) + (yC*dC);
xG = (zA*dA) + (zB*dB) + (zC*dC);
Basic, and not really smart but will do the job for some simple tasks.
EDIT
You can take any confidence coef you want in [0,inf[, but IMHO, restraining at [0,1] is a good idea to keep a realistic result.

Can I calculate a transformation matrix given a set of points?

I'm trying to deduct the 2D-transformation parameters from the result.
Given is a large number of samples in an unknown X-Y-coordinate system as well as their respective counterparts in WGS84 (longitude, latitude). Since the area is small, we can assume the target system to be flat, too.
Sadly I don't know which order of scale, rotate, translate was used, and I'm not even sure if there were 1 or 2 translations.
I tried to create a lengthy equation system, but that ended up too complex for me to handle. Basic geometry also failed me, as the order of transformations is unknown and I would have to check every possible combination order.
Is there a systematic approach to this problem?
Figuring out the scaling factor is easy, just choose any two points and find the distance between them in your X-Y space and your WGS84 space and the ratio of them is your scaling factor.
The rotations and translations is a little trickier, but not nearly as difficult when you learn that the result of applying any number of rotations or translations (in 2 dimensions only!) can be reduced to a single rotation about some unknown point by some unknown angle.
Suddenly you have N points to determine 3 unknowns, the axis of rotation (x and y coordinate) and the angle of rotation.
Calculating the rotation looks like this:
Pr = R*(Pxy - Paxis_xy) + Paxis_xy
Pr is your rotated point in X-Y space which then needs to be converted to WGS84 space (if the axes of your coordinate systems are different).
R is the familiar rotation matrix depending on your rotation angle.
Pxy is your unrotated point in X-Y space.
Paxis_xy is the axis of rotation in X-Y space.
To actually find the 3 unknowns, you need to un-scale your WGS84 points (or equivalently scale your X-Y points) by the scaling factor you found and shift your points so that the two coordinate systems have the same origin.
First, finding the angle of rotation: take two corresponding pairs of points P1, P1' and P2, P2' and write out
P1' = R(P1-A) + A
P2' = R(P2-A) + A
where I swapped A = Paxis_xy for brevity. Subtracting the two equations gives:
P2'-P1' = R(P2-P1)
B = R * C
Bx = cos(a) * Cx - sin(a) * Cy
By = cos(a) * Cx + sin(a) * Cy
By + Bx = 2 * cos(a) * Cx
(By + Bx) / (2 * Cx) = cos(a)
...
(By - Bx) / (2 * Cy) = sin(a)
a = atan2(sin(a), cos(a)) <-- to get the right quadrant
And you have your angle, you can also do a quick check that cos(a) * cos(a) + sin(a) * sin(a) == 1 to make sure either you got all the calculations correct or that your system really is an orientation-preserving isometry (consists only of translations and rotations).
Now that we know a we know R and so to find A we do:
P1` = R(P1-A) + A
P1' - R*P1 = (I-R)A
A = (inverse(I-R)) * (P1' - R*P1)
where the inversion of a 2x2 matrix is easy.
EDIT: There is an error in the above, or more specifically one case that needs to be treated separately.
There is one combination of translations and rotations that does not reduce to a single rotation and that is a single translation. You can think of it in terms of fixed points (how many points are unchanged after the operation).
A translation has no fixed points (all points are changed) and a rotation has 1 fixed point (the axis doesn't change). It turns out that two rotations leave 1 fixed point and a translation and a rotation leaves 1 fixed point, which (with a little proof that says the number of fixed points tells you the operation performed) is the reason that arbitrary combinations of these result in a single rotation.
What this means for you is that if your angle comes out as 0 then using the method above will give you A = 0 as well, which is likely incorrect. In this case you have to do A = P1' - P1.
If I understood the question correctly, you have n points (X1,Y1),...,(Xn,Yn), the corresponding points, say, (x1,y1),...,(xn,yn) in another coordinate system, and the former are supposedly obtained from the latter by rotation, scaling and translation.
Note that this data does not determine the fixed point of rotation / scaling, or the order in which the operations "should" be applied. On the other hand, if you know these beforehand or choose them arbitrarily, you will find a rotation, translation and scaling factor that transform the data as supposed to.
For example, you can pick an any point, say, p0 = [X1, Y1]T (column vector) as the fixed point of rotation & scaling and subtract its coordinates from those of two other points to get p2 = [X2-X1, Y2-Y1]T, and p3 = [X3-X1, Y3-Y1]T. Also take the column vectors q2 = [x2-x1, y2-y1]T, q3 = [x3-x1, y3-y1]T. Now [p2 p3] = A*[q2 q3], where A is an unknwon 2x2 matrix representing the roto-scaling. You can solve it (unless you were unlucky and chose degenerate points) as A = [p2 p3] * [q2 q3]-1 where -1 denotes matrix inverse (of the 2x2 matrix [q2 q3]). Now, if the transformation between the coordinate systems really is a roto-scaling-translation, all the points should satisfy Pk = A * (Qk-q0) + p0, where Pk = [Xk, Yk]T, Qk = [xk, yk]T, q0=[x1, y1]T, and k=1,..,n.
If you want, you can quite easily determine the scaling and rotation parameter from the components of A or combine b = -A * q0 + p0 to get Pk = A*Qk + b.
The above method does not react well to noise or choosing degenerate points. If necessary, this can be fixed by applying, e.g., Principal Component Analysis, which is also just a few lines of code if MATLAB or some other linear algebra tools are available.

Algorithm to calculate the distances between many geo points

I have a matrix having around 1000 geospatial points(Longitude, Latitude) and i am trying to find the points that are in 1KM range.
NOTE: "The points are dynamic, Imagine 1000 vehicles are moving, so i have to re-calculate all distances every few seconds"
I did some searches and read about Graph algorithms like (Floyd–Warshall) to solve this, and I ended up with many keywords, and i am kinda lost now. I am considering the performance and since the search radius is short, I will not consider the curvature of the earth.
Basically, It appears that i have to calculate the distance between every point to every other point then sort the distances starting from every point in the matrix and get the points that are in its range. So if I have 1000 co-ordinates, I have to perfom this process (1000^2-1000) times and I do not beleive this is the optimum solution. Thank You.
If you make a modell with a grid of 1km spacing:
0 1 2 3
___|____|____|____
0 | | |
c| b|a | d
___|____|____|____
1 | | |
| |f |
___|e___|____|____
2 | |g |
let's assume your starting point is a.
If your grid is of 1km size, points in 1km reach have to be in the same cell or one of the 8 neighbours (Points b, d, e, f).
Every other cell can be ignored (c,g).
While d is nearly of the same distance to a as c, c can be dropped early, because there are 2 barriers to cross, while a and d lie on opposite areas of their border, and are therefore nearly 2 km away from each other.
For early dropping of element, you can exclude, it is enough to check the x- or y-part of the coordinate. Since a belongs to (0,2), if x is 0 or smaller, or > 3, the point is already out of range.
After filtering only few candidates, you may use exhaustive search.
In your case, you should be looking at the GeoHash which allows you to quickly query the coordinates within a given distance.
FYI, MongoDB uses geohash internally and it's performing excellently.
Try with an R-Tree. The R-Tree supports the operation to find all the points closest to a given point that are not further away than a given radius. The execution time is optimal and I think it's O(number_of_points_in_the_result).
You could compute geocodes of 1km range around each of those 1000 coordinates and check, whether some points are in that range. May be it's not optimum, but you will save yourself some sorting.
If you want to lookup the matrix for each point vs. each point then you already got the right formula (1000^2-1000). There isn't any shortcut for this calculation. However when you know where to start the search and you want look for points within a 1KM radius you can use a grid or spatial algorithm to speed up the lookup. Most likely it's uses a divide and conquer algorithm and the cheapest of it is a geohash or a z curve. You can also try a kd-tree. Maybe this is even simpler. But if your points are in euklidian space then there is this planar method describe here: http://en.wikipedia.org/wiki/Closest_pair_of_points_problem.
Edit: When I say 1000^2-1000 then I mean the size of the grid but it's actually 1000^(1000 − 1) / 2 pairs of points so a lot less math.
I have something sort of similar on a web page I worked on, I think. The user clicks a location on the map and enters a radius, and a function returns all the locations within a database within the given radius. Do you mean you are trying to find the points that are within 1km of one of the points in the radius? Or are you trying to find the points that are within 1km of each other? I think you should do something like this.
radius = given radius
x1 = latitude of given point;
y1 = longitude of given point;
x2 = null;
y2 = null;
x = null;
y = null;
dist = null;
for ( i=0; i<locationArray.length; i++ ) {
x2 = locationArray[i].latitude;
y2 = locationArray[i].longitude;
x = x1 - x2;
y = y1 - y2;
dist = sqrt(x^2 + y^2);
if (dist <= radius)
these are your points
}
If you are trying to calculate all of the points that are within 1km of another point, you could add an outer loop giving the information of x1 and y1, which would then make the inner loop test the distance between the given point and every other point giving every point in your matrix as input. The calculations shouldn't take too long, since it is so basic.
I had the same problem but in a web service development
In my case to avoid the calculation time problem i used a simple divide & conquer solution : The idea was start the calculation of the distance between the new point and the others in every new data insertion, so that my application access directly the distance between those tow points that had been already calculated and put in my database

Resources