I was asked the following question in an interview recently:
Suppose you have the following grid on a Cartesian coordinate system (Quadrant I).
o - x - x - x - o
| | | | |
x - x - x - o - x
| | | | |
x - o - o - x - x
where, o => person at intersection and x => no person at intersection
class Point {
int x;
int y;
boolean hasPerson;
}
Point findNearestPointWhereAllPeopleCanMeet(Point[] people) {
}
Implement a routine that, given a list of people's locations (points), finds a location (point) that is closest to all the given points.
How would you solve this problem?
1) One approach is a k-d tree, but do you know of any other solution?
If the problem calls for minimizing the Manhattan distance (that is, people only walk parallel to the axes), then the problem is pretty trivial. First, selecting the x coordinate and the y coordinate are independent problems.
Then, for each coordinate, simply find the median value of the positions of the people along that axis. For many configurations of people, there can be more than one point that minimizes the sum of the walking distances of all people. (Just consider 2 people separated by more than 2 blocks in x and at the same y coordinate; any intersection in between will require the same total walking by the two people.)
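For the Manhattan case, here is a minimal Java sketch of the median idea, assuming the Point class from the question (with its implicit no-argument constructor and accessible fields) and taking the lower median when the count is even:

Point findNearestPointWhereAllPeopleCanMeet(Point[] people) {
    int n = people.length;
    int[] xs = new int[n];
    int[] ys = new int[n];
    for (int i = 0; i < n; i++) {
        xs[i] = people[i].x;
        ys[i] = people[i].y;
    }
    java.util.Arrays.sort(xs);
    java.util.Arrays.sort(ys);
    // the two coordinates are independent; a (lower) median minimizes the sum of |xi - x|
    Point meeting = new Point();
    meeting.x = xs[(n - 1) / 2];
    meeting.y = ys[(n - 1) / 2];
    return meeting;
}

Any intersection between the lower and upper medians on each axis gives the same total walking distance, which is why several points can be optimal.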
If the problem calls for minimizing the Euclidean distance, then the goal is to find the 2-variable L1 median. This is a standard problem, but it is far from trivial. (See here, for instance.) There is a unique answer. Given that this was an interview question, I suspect that this does not apply.
Before going off to study k-d trees, here are some thoughts that came to my mind:
Iterate every point of your matrix, graph, or whatever it is
Assign to each point (x,y) a value representing the MAX_distance from the Point to any of the people. (See a clarification example below)
Take any of the points that have the lowest MAX_distance
E.G. Given Point(0,0):
For (1,0) the distance is: 1
For (2,0) the distance is: 2
For (0,2) the distance is: 2
For (3,1) the distance is: 4
For (4,2) the distance is: 6
The MAX_distance for (0,0) is 6. Given your input the lowest MAX_distance should be 3 and there are many Points with that value like (2,1) for instance.
There should be ways to make this algorithm more efficient, maybe with k-d trees :p or with other tweaks, like checking the total number of people, their relative locations/distances, the MAX_distance value at any iteration, etc.
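A brute-force sketch of this idea, assuming Manhattan distance (which matches the example distances above) and hypothetical grid dimensions width and height:

Point bestMeetingPoint(Point[] people, int width, int height) {
    Point best = null;
    int bestMax = Integer.MAX_VALUE;
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            int worst = 0;                           // MAX_distance for (x, y)
            for (Point p : people) {
                worst = Math.max(worst, Math.abs(p.x - x) + Math.abs(p.y - y));
            }
            if (worst < bestMax) {
                bestMax = worst;
                best = new Point();
                best.x = x;
                best.y = y;
            }
        }
    }
    return best;
}

This is O(width * height * number_of_people); the tweaks mentioned above only start to matter on much larger grids.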
This will probably give you an approximation rather than the exact answer.
But maybe you can try some sort of clustering (see medical image processing).
What if you project all points onto the Y axis:
3*
4*
3*
Then project onto the X axis:
2* 2* 2* 2* 2*
Legend: 3* means 3 people at this coordinate on the axis
Now find the median, also using the weights (the weight of a location = how many people are at that location on the axis).
If you find the median for both axes, then you could take the meeting point as (medianX, medianY).
You could get the correct closest point if, when you calculate the median on one axis, you also make sure to minimize the distance by calculating the median of the other axis. This latter case is harder.
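A small sketch of the weighted-median step, assuming counts[i] holds how many people project onto coordinate i of an axis (the numbers marked with * above):

int weightedMedian(int[] counts) {
    int total = 0;
    for (int c : counts) total += c;
    int half = (total + 1) / 2;            // first index where the cumulative weight reaches half
    int running = 0;
    for (int i = 0; i < counts.length; i++) {
        running += counts[i];
        if (running >= half) return i;
    }
    return counts.length - 1;              // only reached if total == 0
}

Calling it once on the X-axis counts and once on the Y-axis counts gives the (medianX, medianY) meeting point described above.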
Related
The problem is as follows:
Given N (N <= 100,000) points by their Cartesian coordinates, test whether every rectangle (with an area > 0) built using any two of them contains at least one other point from the list, either inside the rectangle or on its margin.
I know the algorithm for O(N^2) time complexity. I want to find the solution for O(N * logN). Memory is not a problem.
When I say two points define a rectangle, I mean that those two points are in two of its opposite corners. The sides of the rectangles are parallel to the Cartesian axes.
Here are two examples:
N = 4
Points:
1 1
3 5
2 4
8 8
INCORRECT: (the rectangle (1, 1) - (2, 4) doesn't have any point inside of it or on its margin).
N = 3
Points:
10 9
13 9
10 8
CORRECT.
Sort the points in order of the max of their coordinate pairs, from lowest to highest (O(n*log(n))). Starting with the lower-left point (lowest max coordinate), if the next point in the ordering does not share either the original point's x-value or its y-value (e.g. (1,2) and (5,2) share a y-coordinate of 2, but (1,2) and (2, 1) have neither coordinate in common), the set of points fails the test. Otherwise, move to the next point. If you reach the final point in this way (O(n)), the set is valid. The algorithm overall is O(n*log(n)).
Explanation
Two points that share a coordinate value lie along a line parallel to one of the axes, so there is no rectangle drawn between them (since such a rectangle would have area=0).
For a given point p1, if the next "bigger" point in the ordering, p2, is directly vertical or horizontal from p1 (i.e. it shares a coordinate value), then all points to the upper-right of p2 form rectangles with p1 that include p2, so there are no rectangles in the set that have p1 as the lower-left corner and lack an internal point.
If, however, the next-bigger point is diagonal from p1, then the p1 <-> p2 rectangle has no points from the set inside it, so the set is invalid.
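A literal sketch of this check in Java, assuming a Point type with integer x and y fields; it implements exactly the sort-and-compare-neighbours step described above:

boolean everyRectangleContainsAPoint(Point[] points) {
    // sort by the max of the two coordinates, lowest to highest: O(N log N)
    java.util.Arrays.sort(points, (p, q) ->
            Integer.compare(Math.max(p.x, p.y), Math.max(q.x, q.y)));
    for (int i = 0; i + 1 < points.length; i++) {
        Point a = points[i];
        Point b = points[i + 1];
        if (a.x != b.x && a.y != b.y) {
            return false;       // neighbours in the ordering share no coordinate
        }
    }
    return true;                // O(N) scan after the sort
}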
For every point P = (a, b) in the set, search the nearest points of the form Y = (x, b) and X = (a, y) such that x > a and y > b.
Then, find if the rectangle defined by the two points X, Y contains any* internal point R besides P, X and Y. If that is the case, it's easy to see that the rectangle P, R does not contain any other point in the set.
If no point in the set exists matching the restrictions for X or Y, then you have to use (a, ∞) or (∞, b) respectively instead.
The computational cost of the algorithm is O(NlogN):
Looking for X or Y can be done using binary search [O(logN)] over a presorted list [O(NlogN)].
Looking for R can be done using some spatial tree structure as a quadtree or a k-d tree [O(logN)].
*) If X, Y contains more than one internal point, R should be selected as the nearest to P.
Update: the algorithm above works for rectangles defined by their bottom-left and upper-right corners. In order to make it work also for rectangles defined by their bottom-right and upper-left corners, a new point X' (such that it is the nearest one to P of the form X' = (a, y') where y' < b) and the corresponding rectangle defined by X', Y should also be considered for every point in the set.
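A Java sketch of the X/Y lookup: group the x values by row (y value) in sorted sets so that Y = (x, b) with the smallest x > a becomes a single higher() call; a symmetric map keyed by x gives X = (a, y) in the same way. The names are illustrative, and a Point type with integer x and y fields is assumed:

import java.util.*;

static Map<Integer, TreeSet<Integer>> groupXsByY(Point[] points) {
    Map<Integer, TreeSet<Integer>> xsByY = new HashMap<>();
    for (Point p : points) {
        xsByY.computeIfAbsent(p.y, k -> new TreeSet<>()).add(p.x);
    }
    return xsByY;                                   // built in O(N log N)
}

static Integer nearestXToTheRight(Map<Integer, TreeSet<Integer>> xsByY, int a, int b) {
    TreeSet<Integer> row = xsByY.get(b);
    return (row == null) ? null : row.higher(a);    // null: no such point, fall back to (∞, b)
}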
I have a matrix of around 1000 geospatial points (longitude, latitude) and I am trying to find the points that are within a 1 km range of each other.
NOTE: "The points are dynamic. Imagine 1000 vehicles are moving, so I have to re-calculate all distances every few seconds."
I did some searches and read about graph algorithms like Floyd–Warshall to solve this, and I ended up with many keywords, and I am kind of lost now. I am considering the performance, and since the search radius is short, I will not consider the curvature of the earth.
Basically, it appears that I have to calculate the distance from every point to every other point, then sort the distances starting from every point in the matrix and get the points that are in its range. So if I have 1000 coordinates, I have to perform this process (1000^2 - 1000) times, and I do not believe this is the optimum solution. Thank you.
If you make a model with a grid of 1 km spacing:
0 1 2 3
___|____|____|____
0 | | |
c| b|a | d
___|____|____|____
1 | | |
| |f |
___|e___|____|____
2 | |g |
let's assume your starting point is a.
If your grid has 1 km cells, points within 1 km reach have to be in the same cell or in one of the 8 neighbouring cells (points b, d, e, f).
Every other cell can be ignored (c, g).
Although d is nearly as far from a as c is, c can be dropped early because there are 2 cell borders to cross between them, whereas d has to be rejected by the exact distance check: a and d lie on opposite sides of their cells and are therefore nearly 2 km away from each other.
For early dropping of elements you can exclude, it is enough to check the x or y part of the coordinate. Since a belongs to cell (0,2), if x is 0 or smaller, or greater than 3, the point is already out of range.
After this filtering, only a few candidates remain, and you can use exhaustive search on them.
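A rough Java sketch of this bucketing, treating coordinates as planar kilometres (the question already ignores the earth's curvature; a real implementation would first project longitude/latitude into a local metric system). The class name and layout are illustrative:

import java.util.*;

class GridIndex {
    private final Map<String, List<double[]>> cells = new HashMap<>();

    private static String key(double xKm, double yKm) {
        return (long) Math.floor(xKm) + ":" + (long) Math.floor(yKm);   // 1 km cells
    }

    void add(double xKm, double yKm) {
        cells.computeIfAbsent(key(xKm, yKm), k -> new ArrayList<>())
             .add(new double[] { xKm, yKm });
    }

    // candidates come only from the point's own cell and its 8 neighbours (b, d, e, f above)
    List<double[]> withinOneKm(double xKm, double yKm) {
        List<double[]> result = new ArrayList<>();
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                List<double[]> cell = cells.get(key(xKm + dx, yKm + dy));
                if (cell == null) continue;
                for (double[] p : cell) {
                    if (Math.hypot(p[0] - xKm, p[1] - yKm) <= 1.0) {
                        result.add(p);              // exact check only for the few candidates
                    }
                }
            }
        }
        return result;
    }
}

Since the vehicles move, the index is rebuilt (or incrementally updated) every few seconds, which is still far cheaper than the full all-pairs comparison.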
In your case, you should be looking at the GeoHash which allows you to quickly query the coordinates within a given distance.
FYI, MongoDB uses geohash internally and it's performing excellently.
Try an R-tree. The R-tree supports the operation of finding all the points around a given point that are no further away than a given radius. The execution time is optimal, and I think it's O(number_of_points_in_the_result).
You could compute geocodes for a 1 km range around each of those 1000 coordinates and check whether some points fall in that range. Maybe it's not optimal, but you will save yourself some sorting.
If you want to look up the matrix for each point vs. each point, then you already have the right formula (1000^2 - 1000). There isn't any shortcut for this calculation. However, when you know where to start the search and you want to look for points within a 1 km radius, you can use a grid or spatial algorithm to speed up the lookup. Most likely it uses a divide-and-conquer approach, and the cheapest of these is a geohash or a Z-curve. You can also try a k-d tree; maybe that is even simpler. But if your points are in Euclidean space, then there is the planar method described here: http://en.wikipedia.org/wiki/Closest_pair_of_points_problem.
Edit: When I say 1000^2 - 1000, I mean the size of the grid, but it's actually 1000 × (1000 − 1) / 2 pairs of points, so a lot less math.
I have something sort of similar on a web page I worked on, I think. The user clicks a location on the map and enters a radius, and a function returns all the locations in a database that are within the given radius. Do you mean you are trying to find the points that are within 1 km of one of the points? Or are you trying to find the points that are within 1 km of each other? I think you should do something like this:
double radius = /* given radius */;
double x1 = /* latitude of given point */;
double y1 = /* longitude of given point */;
// coordinates are assumed to already be in a planar unit consistent with the radius
for (int i = 0; i < locationArray.length; i++) {
    double x2 = locationArray[i].latitude;
    double y2 = locationArray[i].longitude;
    double dx = x1 - x2;
    double dy = y1 - y2;
    double dist = Math.sqrt(dx * dx + dy * dy);   // not x^2: ^ is XOR in most C-like languages
    if (dist <= radius) {
        // these are your points
    }
}
If you are trying to calculate all of the points that are within 1 km of another point, you could add an outer loop that supplies x1 and y1, so the inner loop tests the distance between that point and every other point in your matrix. The calculations shouldn't take too long, since they are so basic.
I had the same problem, but in web service development.
In my case, to avoid the calculation-time problem I used a simple divide & conquer solution: the idea was to start calculating the distance between the new point and the others at every new data insertion, so that my application can directly access the distance between those two points, which has already been calculated and stored in my database.
Generally we traverse an array by row or by column, but here I want to traverse it at an angle.
I will try and explain what I mean,
So let's say the angle is 45 degrees; then rather than row by column it would search as (0,0), then (0,1), (1,0), then (0,2), (1,1), (2,0), and so on. (Sorry, I could not upload an image as I am a new user and not allowed to do so; maybe try to imagine/draw the array, which should help convey what I am trying to say.)
But what happens if the user inputs an angle like 20 degrees? How can we determine how to search the array?
I just wanted to know if there is any algorithm that does something similar to this. The programming language is not an issue; I guess the question is more about the algorithm.
Any ideas would be welcome.
Please feel free to ask if I am not able to explain clearly what I am looking for.
Thanks guys.
Easy. Take an angle (let's say 45 degrees). This corresponds to the vector v = (1, 1) in your case. (This can be normalized to the unit vector (sqrt(2)/2, sqrt(2)/2), but that is not necessary.)
For every single point in your array, you have their coordinates (x, y). Simply do the scalar product of these coordinates with the vector. Let's call f(x, y) = scalarProduct((x, y), v)
Sort the values of f(x, y) and you've got the "traversing" you're looking for!
A real example.
Your matrix is 3x3
The scalar products are :
(0,0).(1,1) = 0
(0,1).(1,1) = 1
(0,2).(1,1) = 2
(1,0).(1,1) = 1
(1,1).(1,1) = 2
(1,2).(1,1) = 3
(2,0).(1,1) = 2
(2,1).(1,1) = 3
(2,2).(1,1) = 4
If you order these scalar products in ascending order, you obtain the ordering (0,0), then (0,1), (1,0), then (0,2), (1,1), (2,0), then (1,2), (2,1), and finally (2,2).
And if you want to do it with an angle of 20 degrees, replace all occurrences of v = (1, 1) with v = (cos(20), sin(20)).
Geometrically, each scalar product marks where the line through a grid point perpendicular to v intersects the vector v; sorting by scalar product traverses those intersection points along v.
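A small sketch of this ordering in Java; the angle is taken in degrees, and the matrix is only needed for its dimensions:

import java.util.*;

static List<int[]> traversalOrder(int rows, int cols, double angleDegrees) {
    double vx = Math.cos(Math.toRadians(angleDegrees));
    double vy = Math.sin(Math.toRadians(angleDegrees));
    List<int[]> cells = new ArrayList<>();
    for (int x = 0; x < rows; x++) {
        for (int y = 0; y < cols; y++) {
            cells.add(new int[] { x, y });
        }
    }
    // sort by the scalar product (x, y) . v
    cells.sort(Comparator.comparingDouble((int[] c) -> c[0] * vx + c[1] * vy));
    return cells;
}

traversalOrder(3, 3, 45) reproduces the ordering above; cells with equal scalar products (ties) can come out in either order.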
For every starting point (the leftmost point of every row), use trigonometry to determine an ending point for the given angle. tan(angle) is defined as (height difference / width of the array), so your height difference is tan(angle) * (width of the array). You only have to calculate the height difference once. If y + height difference is greater than the height of the array, just subtract the height (or use the modulo operator).
Now that you have a starting point and an ending point you could use Bresenham's Algorithm to determine the points in between: http://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm
You want to look for a space-filling curve, for example a Morton curve (Z-order curve). If you want to subdivide the array into 4 tiles, you may want to look for a Hilbert curve or a Moore curve.
I have a few Cartesian points of the form (x, y), where x and y are both non-negative integers. For example: (0,0), (1,1), (0,1).
I need an algorithm to arrange the above points in such a way that going from one point to the next changes either x or y by 1. In other words, I would like to avoid diagonal movement.
So, the points mentioned above would be arranged as: (0,0), (0,1), (1,1).
Similarly, for (0,0), (1,1), (0,2) there is no such arrangement possible.
I am not sure what to call it, but I would call it Manhattan ordering.
Can anyone help?
If you are looking for an arrangement that does not repeat vertices:
What you seem to be looking for is a Hamiltonian Path in a Grid Graph.
This is known to be NP-Complete for general Grid Graphs, see Hamilton Paths in Grid Graphs.
So you can probably try your luck with any of the approximate/heuristic/etc algorithms known for Hamiltonian Path/Euclidean Traveling Salesman Problem.
If you are looking for an arrangement that can repeat, but want the minimum possible number of points in the arrangement:
This is again NP-Complete. The above problem can be reduced to it. This is because the minimum possible walk has n vertices if and only if the graph has a hamiltonian path.
If you are just looking for some arrangement of points,
Then all you need to do is check if the graph is connected. If it is not connected, there can be no such arrangement.
You can do a depth first search to figure that out. The depth first search will also give you such an arrangement in case the graph is connected.
If you want something closer to optimal (but in reasonably fast time), you can probably use approximation algorithms for the Euclidean Traveling Salesman problem.
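For the "just check connectivity" case, here is a minimal Java sketch that treats the points as an implicit grid graph (edges between points at Manhattan distance 1) and runs an iterative DFS; it assumes the points are distinct and given as int pairs:

import java.util.*;

static boolean isConnected(int[][] pts) {
    if (pts.length == 0) return true;
    Map<Long, int[]> byKey = new HashMap<>();
    for (int[] p : pts) byKey.put(key(p[0], p[1]), p);
    Set<Long> seen = new HashSet<>();
    Deque<int[]> stack = new ArrayDeque<>();
    stack.push(pts[0]);
    seen.add(key(pts[0][0], pts[0][1]));
    int[][] steps = { { 1, 0 }, { -1, 0 }, { 0, 1 }, { 0, -1 } };   // the four unit moves
    while (!stack.isEmpty()) {
        int[] p = stack.pop();
        for (int[] s : steps) {
            long k = key(p[0] + s[0], p[1] + s[1]);
            if (byKey.containsKey(k) && seen.add(k)) {
                stack.push(byKey.get(k));
            }
        }
    }
    return seen.size() == pts.length;   // connected iff the DFS reached every point
}

static long key(int x, int y) {
    return ((long) x << 32) ^ (y & 0xffffffffL);
}

If the graph is connected, recording the DFS moves (including the backtracking steps) yields one valid, though generally not minimal, arrangement with repeats.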
You can construct a graph with the vertices being your points, and the edges being the valid steps.
What you're then looking for is a Hamiltonian path for this graph. This, in its general form, is an NP-complete problem, which means there is no known efficient solution (i.e. one that scales well with the number of points). Wikipedia describes a randomized algorithm that is "fast on most graphs" and might be of use:
Start from a random vertex, and continue if there is a neighbor not visited. If there are no more unvisited neighbors, and the path formed isn't Hamiltonian, pick a neighbor uniformly at random, and rotate using that neighbor as a pivot. (That is, add an edge to that neighbor, and remove one of the existing edges from that neighbor so as not to form a loop.) Then, continue the algorithm at the new end of the path.
A more efficient solution might exist for this particular situation, though.
Think of it as a graph where each node has at most four edges. Then do a depth-first or breadth-first search.
This could be simplified as minimizing the distance between each consecutive point. Going from (0,0) to (0,1) is simply 1 unit, but going from (0,0) to (1,1) is actually sqrt(2). So if you arrange the points into a graph, and then perform a simple minimum-total-distance traversal (traveling salesman), it should arrange them correctly.
Edit: If you never want a step that would be larger than 1, simply do not add any edges that are greater than 1. The traversal will still work correctly, and ignore any paths that require a movement > 1.
Edit 2: To clarify further, you can use any edge selection algorithm you wish. If you're ok with it moving 2 spaces, as long as the space is not diagonal, then you may choose to put an edge between (0,2) and (0,4). The minimum distance algorithm will still work. Simply place the edges in a smart way, and you can use any number of selection criteria to determine the outcome.
One way to do this is to create two sorted lists of the coordinates. One sorted by x and one sorted by y. Then, recursively find a solution.
Code coming (not sure what language yet; maybe pseudocode?)... Edit: never mind, since my answer isn't as good as some of the others anyway.
How about a brute force recursive REXX routine... Try all possible
paths and print those that work out.
/* rexx */
point. = 0 /* Boolean for each existing point */
say 'Enter origin x and y coordinate:'
pull xo yo
point.xo.yo = 1 /* Point exists... */
say 'Enter destination x and y coordinate:'
pull xd yd
point.xd.yd = 1 /* Point exists... */
say 'Enter remaining x and y coordinates, one pair per line:'
do forever
pull x y
if x = '' then leave
point.x.y = 1
end
path = ''
call findpath xo yo path
say 'All possible paths have been displayed'
return
findpath: procedure expose point. xd yd
arg x y path
if \point.x.y then return /* no such point */
newpoint = '(' || x y || ')'
if pos(newpoint, path) > 0 then return /* abandon on cycle */
if x = xd & y = yd then /* found a path */
say path newpoint
else do /* keep searching */
call findpath x+1 y path newpoint
call findpath x-1 y path newpoint
call findpath x y+1 path newpoint
call findpath x y-1 path newpoint
end
return
Example session:
Path.rex
Enter origin x and y coordinate:
0 0
Enter destination x and y coordinate:
2 2
Enter remaining x and y coordinates, one pair per line:
0 1
1 1
2 1
1 2
(0 0) (0 1) (1 1) (2 1) (2 2)
(0 0) (0 1) (1 1) (1 2) (2 2)
All possible paths have been displayed
Don't try this on anything big though - could take a very long time! But then the question never mentioned anything about being an optimal solution.
Given n squares with edge length l, how can I determine the minimum radius r of the circle so that I can distribute all squares evenly along the perimeter of the circle without them overlapping? (Constraint: the first square will always be positioned at 12 o'clock.)
Followup question: how can I place n identical rectangles with height h and width w?
There may be a mathematically clever way to do this, but I wouldn't know.
I think it's complicated a bit by the fact that the geometry is different for every different number of squares; for 4 it's a rhombus, for 5 it's a pentagon and so on.
What I'd do is place those squares on a 1 unit circle (much too small, I know, bear with me) distributed equally on it. That's easy enough, just subtend (divide) your 360 degrees by the number of squares. Then just test all your squares for overlap against their neighbors; if they overlap, increase the radius.
You can make this procedure less stupid than it sounds by using an intelligent algorithm to approach the right size. I'm thinking of something like interval bisection: given two successive guesses, of which one is too small and one is too big, your next guess is the average of those two.
You can iterate down to any precision you like. Stop whenever the distance between guesses is smaller than some arbitrary small margin of error.
EDIT I have a better solution:
I was thinking about what to tell you if you asked "how will I know if squares overlap?" This gave me an idea on how to calculate the circle size exactly, in one step:
Place your squares on a much-too-small circle. You know how: Calculate the points on the circle where your 360/n angles intersect it, and put the center of the square there. Actually, you don't need to place squares yet, the next steps only require midpoints.
To calculate the minimum distance of a square to its neighbor: Calculate the difference in X and the difference in Y of the midpoints, and take the minimum of those. The X's and Y's are actually just cosines and sines on the circle.
You'll want the minimum of any square against its neighbor (clockwise, say). So you need to work your way around the circle to find the very smallest one.
The minimum (X or Y) distance between the squares needs to become 1.0 . So just take the reciprocal of the minimum distance and multiply the circle's size by that. Presto, your circle is the right size.
EDIT
Without losing generality, I think it's possible to nail my solution down a bit so it's close to coding. Here's a refinement:
Assume the squares have size 1, i.e. each side has a length of 1 unit. In the end, your boxes will surely be larger than 1 pixel but it's just a matter of scaling.
Get rid of the corner cases:
if (n < 2) throw new IllegalArgumentException();
if (n == 2) return 0.5; // 2 squares will fit exactly on a circle of radius 0.5
Start with a circle size r of 0.5, which will surely be too small for any number of squares > 2.
r = 0.5;
dmin = 1.0;                             // start assuming minimum distance is fine
a = 2 * PI / n;
for (p1 = 0.0; p1 <= PI; p1 += a) {     // starting with angle 0, try all points till halfway around
                                        // (yeah, we're starting east, not north. doesn't matter)
    p2 = p1 + a;                        // next point on the circle
    dx = abs(r * cos(p2) - r * cos(p1));
    dy = abs(r * sin(p2) - r * sin(p1));
    dmin = min(dmin, dx, dy);
}
r = r / dmin;
EDIT
I turned this into real Java code, quite similar to the above, and got it to run. Code and results here: http://ideone.com/r9aiu
I created graphical output using GnuPlot. I was able to create simple diagrams of boxes arranged in a circle by cut-and-pasting the point sets from the output into a data file and then running
plot '5.dat' with boxxyerrorbars
The .5's in the file serve to size the boxes... lazy but working solution. The .5 is applied to both sides of the center, so the boxes end up being exactly 1.0 in size.
Alas, my algorithm doesn't work. It makes the radii far too large, thus placing the boxes much further apart than necessary. Even scaling down by a factor of 2 (could have been a mistake to use 0.5 in some places) didn't help.
Sorry, I give up. Maybe my approach can be salvaged, but it doesn't work the way I had thought it would. :(
EDIT
I hate giving up. I was about to leave my PC when I thought of a way to salvage my algorithm:
The algorithm was adjusting the smaller of the X or Y distances to be at least 1. It's easy to demonstrate that's just plain silly. When you have a lot of boxes then at the eastern and western edges of the circle you have boxes stacked almost directly on top of each other, with their X's very close to one another but they are saved from touching by having just enough Y distance between them.
So... to make this work, you must scale the maximum of dx and dy to be (for all cases) at least the radius (or was it double the radius?).
Corrected code is here: http://ideone.com/EQ03g http://ideone.com/VRyyo
Tested again in GnuPlot, it produces beautiful little circles of boxes where sometimes just 1 or 2 boxes are exactly touching. Problem solved! :)
(These images are wider than they are tall because GnuPlot didn't know I wanted proportional layout. Just imagine the whole works squeezed into a square shape :) )
I would calculate an upper bound of the minimum radius, by working with circles enclosing the squares instead of with the squares themselves.
My calculation results in:
Rmin <= X / (sqrt(2) * sin (180/N) )
Where:
X is the square side length, and N is the required number of squares.
I assume that the circles are positioned such that their centers fall on the big circle's circumference.
-- EDIT --
Using the idea of Dave in the comment below, we can also calculate a nice lower bound, by considering the circles to be inside the squares (thus having radius X/2). This bound is:
Rmin >= X / (2 * sin (180/N) )
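A quick sketch of the two bounds in Java (180/N degrees is PI/N radians):

static double upperBoundRadius(double squareSide, int n) {
    return squareSide / (Math.sqrt(2) * Math.sin(Math.PI / n));   // enclosing circles
}

static double lowerBoundRadius(double squareSide, int n) {
    return squareSide / (2 * Math.sin(Math.PI / n));              // inscribed circles
}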
As already noted, the problem of positioning n points equally spaced round the circumference of a circle is trivial. The (not-terribly) difficult part of the problem is to figure out the radius of the circle needed to give a pleasing layout of the squares. I suggest you follow one of the other answers and think of the squares being inside a circular 'buffer' big enough to contain the square and enough space to satisfy your aesthetic requirements. Then check the formula for the chord length between the centres of neighbouring squares. Now you have the angle, at the centre of the circle, subtended by the chord between square centres, and can easily compute the radius of the circle from the trigonometry of a triangle.
And, as to your follow-up question: I suggest that you work out the problem for squares of side length min(h,w) on a circle, then transform the squares to rectangles and the circle to an ellipse with axis ratio h/w (or w/h).
I would solve it like this:
To find the relation between the radius r and the side length l, let's analyze a dimensionless representation:
get the centres on a circle (x1,y1)..(xn,yn)
from each center get lower right corner of the i-th square and upper left corner of the i+1-th square
the two points should either have equal x or equal y, whichever yields smaller l
procedure should be repeated for each center and the one that yields smallest l is the final solution.
This is the optimal solution and can be solved in terms of r = f(l).
The solution can be adapted to rectangles by adjusting the formula for xLR[i] and yUL[i+1].
Will try to give some pseudo code.
EDIT:
There's a bug in the procedure: the lower-right and upper-left corners are not necessarily the closest points of two neighbouring squares/rectangles.
Let's assume you solved the problem for 3 or 4 squares.
If you have n >= 5 squares and position one square at the top of the circle, you'll have another square fall into the first quadrant of a Cartesian plane concentric with your circle.
The problem is then to find a radius r for the circle such that the left side of the square next to the top one and the right side of the top square do not 'cross' each other.
The x coordinate of the right side of the top square is x1 = L/2, where L is the side of a square. The x coordinate of the left side of the square next to the top one is x2 = r sin a - L/2, where r is the radius and a is the angle between each pair of square centres (a = 360/n degrees), since that square's centre sits at (r sin a, r cos a).
So we need to solve x1 <= x2, which leads to
r >= L / sin a.
L and a are known, so we're done :-)
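As a tiny Java sketch of that formula (angle in radians):

static double minRadius(double sideL, int n) {
    double a = 2 * Math.PI / n;          // angle between adjacent square centres
    return sideL / Math.sin(a);          // from the x1 <= x2 condition above
}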
You start with an arbitrary circle (e.g., with a diameter of n * l) and position the squares evenly on the circumference. Then you go through each pair of adjacent squares and:
calculate the straight line connecting their mid points,
calculate the intersection of this line with the intervening square sides (M1 and M2 are the midpoints, S1 and S2 the corresponding intersections with the square sides):
S2 S1
M1--------------*----------*---------------M2
------------------------
| |
| |
| |
| |
| M1 |
| \ |
| \ |
| -------*------- +--------
| | \ | |
| | \ | |
-------+---------*------ |
| \ |
| M2 |
| |
| |
| |
| |
-------------------------
calculate the scale factor you would need to make S1 and S2 fall together (simply the ratio of the sum of M1-S1 and S2-M2 to M1-M2), and
finally scale the circle by the maximum of the found scale factors.
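Here is a sketch of these four steps in Java for axis-aligned unit squares and n >= 2 (the general case only changes how the exit distance from a square is computed); it returns the corrected radius given an initial guess r:

static double requiredRadius(int n, double r) {
    double worstScale = 0.0;
    for (int i = 0; i < n; i++) {
        // midpoints M1, M2 of two adjacent squares on the circle of radius r
        double a1 = 2 * Math.PI * i / n;
        double a2 = 2 * Math.PI * (i + 1) / n;
        double dx = r * (Math.cos(a2) - Math.cos(a1));
        double dy = r * (Math.sin(a2) - Math.sin(a1));
        double dist = Math.hypot(dx, dy);                    // |M1 M2|
        // distance from a square's centre to where the line M1-M2 leaves that
        // axis-aligned unit square: 0.5 / max(|ux|, |uy|) for the unit direction (ux, uy)
        double t = 0.5 / Math.max(Math.abs(dx / dist), Math.abs(dy / dist));
        double scale = (2 * t) / dist;                       // (|M1-S1| + |S2-M2|) / |M1-M2|
        worstScale = Math.max(worstScale, scale);
    }
    return r * worstScale;                                   // scale the circle by the worst factor
}

As noted in the edit below, only the pairs nearest 45° (and 135° for odd n) really need to be checked; looping over all pairs just keeps the sketch simple.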
Edit: This is the exact solution. However, a little thought can optimize this further for speed:
You only need to do this for the squares closest to 45° (if n is even) resp. 45° and 135° (if n is odd; actually, you might prove that only one of these is necessary).
For large n, the optimal spacing of the squares on the circle will quickly approach the length of a diagonal of a square. You could thus precompute the scaling factors for a few small n (up to a dozen or so), and then have a good enough approximation with the diagonal.