How to reconstruct a road network from GPS coordinates - algorithm

I am trying to reconstruct a road network from a set of GPS coordinates.
I have done some research, but most existing algorithms seem to rely on having information about which coordinates are from the same car. I do not have any information on which points lie on a trace together. As a first step I am assuming that the coordinates are 100% accurate, to make it simpler. I realise that adding points at cross sections will be necessary to ensure there are no intersections. I also assume there are no roads going over or under each other.
So what I have: A set of points in a 2D plane.
What I need to compute: A fully connected network that connects all of these points. This should be the most likely road network.
Does anyone have any thoughts on how to go about doing this?
I thought about starting off with a minimum spanning tree and going from there. But I have no idea what to do next.

I'm assuming you have no information about the points other than their coordinates. Depending on the size of your dataset, you could brute-force it.
Start with a random point, find the point closest to it, and connect the two. Move to the new point and find the closest point that is not yet connected. Then spread out from there.
You will probably need some gymnastics for special cases, such as endpoints, or a long distance between two points that should be connected.
I'll try to give some examples when I get home later today, if you need it.
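A minimal sketch of the nearest-neighbour chaining described above, assuming the points are plain (x, y) tuples. The function name `greedy_chain` and the choice of index 0 as the start are illustrative only; the answer suggests a random start.

```python
import math

def greedy_chain(points):
    """Link points by repeatedly hopping to the nearest unvisited point."""
    if not points:
        return []
    unvisited = set(range(1, len(points)))
    current = 0  # arbitrary start; a random index would work as well
    edges = []
    while unvisited:
        # Brute force: scan every remaining point for the closest one.
        nearest = min(unvisited,
                      key=lambda i: math.dist(points[current], points[i]))
        edges.append((current, nearest))
        unvisited.remove(nearest)
        current = nearest
    return edges

points = [(0, 0), (1, 0), (2, 0), (2, 1), (0, 5)]
print(greedy_chain(points))  # [(0, 1), (1, 2), (2, 3), (3, 4)]
```

Note that the last edge, (3, 4), is a long jump to an isolated point. This is exactly the kind of special case mentioned above: the plain greedy walk always produces one single chain and never branches, so endpoints and junctions need extra handling.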

What I need to compute: A fully connected network that connects all of these points. This should be the most likely road network.
Does anyone have any thoughts on how to go about doing this?
Not at all. It's impossible, unless you make a lot more statements about your individual positions.
So what I have: A set of points in a 2D plane.
Look at it as a graph problem: You've got a bunch of nodes (your GPS coordinates) with no edges. So that's a graph, yes, but a totally disconnected one, hence:
I thought about starting off with a minimum spanning tree and going from there.
Citing Wikipedia:
A minimum spanning tree is a spanning tree of a connected, undirected graph. It connects all the vertices together with the minimal total weighting for its edges.
That would require having a connected graph. You don't have one.
So how would you start connecting your points? You only have a set of points, and a set isn't even ordered! Will you connect points to their nearest neighbor, enforcing some sort of restriction on the number of edges per node? That won't work in urban areas at all: there are main roads that are broader than the distance between back alleys, for example, as well as under- and overpasses, a lot of streets that have lanes going in incompatible directions, etc.
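For reference, the asker's minimum-spanning-tree idea only becomes well defined once every pair of points is treated as a candidate edge weighted by Euclidean distance. That is an assumption, not something the data provides, and the objections above still apply to the result. Under that assumption, a sketch using Prim's algorithm (the name `euclidean_mst` is illustrative):

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the implicit complete graph; O(n^2) time."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known link from the tree to each point
    best_from = [-1] * n        # tree endpoint of that cheapest link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax the attachment cost of every point still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

print(euclidean_mst([(0, 0), (1, 0), (1, 1), (5, 1)]))
# [(0, 1), (1, 2), (2, 3)]
```

A tree has no cycles, so it cannot represent city blocks on its own. At best it is a starting skeleton to which further edges would have to be added.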

Related

Merge adjacent vertices of a graph until single vertex left in the fewest steps possible

I have a game system that can be represented as an undirected, unweighted graph where each vertex has one (relevant) property: a color. The goal of the game in terms of the graph representation is to reduce it down to one vertex in the fewest "steps" possible. In each step, the player can change the color of any one vertex, and all adjacent vertices of the same color are merged with it. (Note that in the example below I just happened to show the user only changing one specific vertex the whole game, but the user can pick any vertex in each step.)
What I am after is a way to compute the fewest amount of steps necessary to "beat" a given graph per the procedure described above, and also provide the specific moves needed to do so. I'm familiar with the basics of path-finding, BFS, and things of that nature, but I'm having a hard time framing this problem in terms of a "fastest path" solution.
I am unable to find this same problem anywhere on Google, or even a graph-theory term that encapsulates the problem. Does anyone have an idea of at least how to get started approaching this problem? Can anyone point me in the right direction?
EDIT Since this problem seems to be really difficult to solve efficiently, perhaps I could change the aim of my question. Could someone describe how I would even set up a brute force, breadth first search for this? (Brute force could possibly be okay, since in practice these graphs will only be 20 vertices at most.) I know how to write a BFS for a normal linked graph data structure... but in this case it seems quite weird since each vertex would have to contain a whole graph within itself, and the next vertices in the search graph would have to be generated based on possible moves to make in the graph within the vertex. How would one setup the data structure and search algorithm to accomplish this?
EDIT 2 This is an old question, but I figured it might help to just state outright what the game was. The game was essentially to be a rip-off of Kami 2 for iOS, except my custom puzzle editor would automatically figure out the quickest possible way to solve your puzzle, instead of having to find the shortest move number by trial and error yourself. I'm not sure if Kami was a completely original game concept, or if there is a whole class of games like it with the same "flood-fill" mechanic that I'm unaware of. If this is a common type of game, perhaps knowing the name of it could allow finding more literature on the algorithm I'm seeking.
EDIT 3 This Stack Overflow question seems like it may have some relevant insights.
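A rough sketch of the brute-force breadth-first search asked about in the first edit. Instead of storing a whole graph in every search node, it assumes the original adjacency list stays fixed and a search state is just a tuple of current colours, one per original vertex. A connected single-colour region then plays the role of a merged vertex. The names `flood` and `solve` are made up for this illustration, and the graph is assumed to be connected.

```python
from collections import deque

def flood(adj, colors, start, new_color):
    """Recolor the connected same-colored region that contains `start`."""
    old = colors[start]
    result = list(colors)
    stack, seen = [start], {start}
    while stack:
        v = stack.pop()
        result[v] = new_color
        for w in adj[v]:
            if w not in seen and colors[w] == old:
                seen.add(w)
                stack.append(w)
    return tuple(result)

def solve(adj, colors):
    """Return a shortest list of (vertex, color) moves, or None."""
    start = tuple(colors)
    palette = sorted(set(colors))
    parent = {start: None}  # state -> (previous state, move that led here)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if len(set(state)) == 1:  # one color left: everything has merged
            moves = []
            while parent[state] is not None:
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for v in range(len(state)):
            for c in palette:
                if c != state[v]:
                    nxt = flood(adj, state, v, c)
                    if nxt not in parent:
                        parent[nxt] = (state, (v, c))
                        queue.append(nxt)
    return None

# Path graph 0 - 1 - 2 - 3 colored R, G, R, B.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(solve(adj, ["R", "G", "R", "B"]))
```

Because BFS explores states in order of move count, the first single-colour state it dequeues is reached by a shortest move sequence. The number of states can grow as (number of colours)^(number of vertices), so this is only plausible for small puzzles such as the 20-vertex ones mentioned above, and even then probably only with few colours.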
Intuitively, the solution seems global. If you take a larger graph, for example, which dot you select first will have an impact on the direct neighbours which will have an impact on their neighbours and so on.
It sounds as if it were of the same breed of problems as the map colouring problem. Not because of the colours but because of the implications of a local selection to the other end of the graph down the road. In the map colouring, you have to decide what colour to draw a country and its neighbouring countries so two countries that touch don't have the same colour. That first set of selections have an impact on whether there is a solution in the subsequent iterations.
Just to show how complex the problem is.
Let's look at a simpler problem, where the graph is replaced by a tree and only the root vertex can change colour. In that case the path to a leaf can be represented as the sequence of colours of the vertices on that path. A sequence A of colour changes collapses a leaf if the leaf's sequence is a subsequence of A.
The problem can then be stated as: given a set of sequences, find a sequence S of minimal length such that each given sequence is a subsequence of S. This is called the shortest common supersequence problem, and it is NP-complete.
Your problem is certainly more complex than this one :-/
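To make the reduction concrete, here is a small sketch. It assumes each leaf path is written as a string of colour letters for the vertices below the root, which is my own encoding. The exhaustive search is only there to illustrate the definition and takes exponential time.

```python
from itertools import product

def is_subsequence(needle, haystack):
    """True if `needle` can be obtained by deleting items from `haystack`."""
    it = iter(haystack)
    return all(ch in it for ch in needle)  # `in` consumes the iterator

def collapses_all(moves, leaf_paths):
    """A move sequence collapses the tree if every leaf path is a subsequence."""
    return all(is_subsequence(path, moves) for path in leaf_paths)

def shortest_supersequence(leaf_paths):
    """Try all colour sequences in order of increasing length (exponential)."""
    alphabet = sorted(set("".join(leaf_paths)))
    length = 0
    while True:
        for candidate in product(alphabet, repeat=length):
            if collapses_all(candidate, leaf_paths):
                return "".join(candidate)
        length += 1

print(collapses_all("GRB", ["GR", "RB"]))    # True
print(collapses_all("GB", ["GR", "RB"]))     # False
print(shortest_supersequence(["GR", "RB"]))  # GRB
```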
Edit *
This is a comment on the question's edit. Check this page for the terms.
The minimal number of moves is >= the graph radius. With that in mind, a good strategy seems to be to:
use central vertices for moves,
use moves that reduce the graph radius, or at least reduce the distance from the central vertices to a 'large' set of vertices.
I would go with a strategy that keeps track of the central vertices and of the distances from all graph vertices to these central vertices. Each step checks all meaningful moves and chooses the one that reduces the radius, or the distance to the central vertices, the most. I think BFS can be used to calculate the distances and how a move influences them. There are tricky parts, such as the central vertices changing after a move. Maybe it is a good idea to use not only the central vertices but also the vertices close to them.
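The radius and the central vertices mentioned above can be computed with one BFS per vertex. A minimal sketch, assuming an unweighted, connected graph stored as an adjacency dict (the function names are illustrative):

```python
from collections import deque

def eccentricity(adj, source):
    """Largest BFS distance from `source` to any other vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())

def radius_and_centre(adj):
    """Radius = smallest eccentricity; centre = vertices attaining it."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    radius = min(ecc.values())
    return radius, [v for v in adj if ecc[v] == radius]

# Path graph 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(radius_and_centre(adj))  # (2, [2])
```

In the game, the graph changes after every move, so these values would have to be recomputed on the merged graph each time. That is one of the tricky parts noted above.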
I think the graph term you are looking for is the "valence" (also called the degree) of a vertex, which is the number of edges incident to that node. It looks like you want to change the color of the node that has the highest valence. Then, in the resulting graph, change the color of the node that has the highest valence, and so on, until you have just one node left.

Identification of closed cells in a graph

I have a graph where nodes represent points in 3D space. Each node is connected only to all other nodes within some cutoff radius. I am trying to enumerate all subgraphs such that the nodes represent the vertices of a polyhedron with no interior nodes or edges.
At first I thought this was a clique problem, but the requirement that all nodes are adjacent to each other isn't working for me. The opposite corners of a cube aren't going to be connected in my dataset, but I need to be able to pull out the cube.
I don't really have a formal education in CS, so I'm not really sure what to search for, but hopefully someone with a better domain vocabulary than me can point me in the right direction.
One way to attack this problem, it seems to me, is to use a 3D triangulation (a "tetrahedralisation"). CGAL can be of service there. You can then generate the different polyhedra by sticking together neighbouring tetrahedra from the triangulation (as long as they don't enclose another vertex in their combined interior).

Finding the starting vertex for Dijkstra's algorithm?

Imagine I am implementing Dijkstra's algorithm at a park. There are points and connections between those points; these specify valid paths the user can walk on (e.g. sidewalks).
Now imagine that the user is on the grass (i.e. not on a path) and wants to navigate to another location. The problem is not in Dijkstra's algorithm (which works fine), the problem is determining at which vertex to begin.
Here is a picture of the problem: (ignore the dotted lines for now)
Black lines show the edges in Dijkstra's algorithm; likewise, purple circles show the vertices. Sidewalks are in gray. The grass is, you guessed it, green. The user is located at the red star, and wants to get to the orange X.
If I naively look for the nearest vertex and use that as my starting point, the user is often directed to a suboptimal path, that involves walking further away from their destination at the start (i.e. the red solid path).
The blue solid path is the optimal path that my algorithm would ideally come up with.
Notes:
Assume no paths cross over other paths.
When navigating to a starting point, the user should never cross over a path (e.g. sidewalk).
In the image above, the first line segment coming out of the star is created dynamically, simply to assist the user. The star is not a vertex in the graph (since the user can be anywhere inside the grass region). The line segment from the star to a vertex is simply being displayed so that the user knows how to get to the first valid vertex in the graph.
How can I implement this efficiently and correctly?
Idea #1: Find the enclosing polygon
If I find the smallest polygon which surrounds my starting point, I can now create new paths for Dijkstra's algorithm from the starting point (which will be added as a new vertex temporarily) to each of the vertices that make up the polygon. In the example above, the polygon has 6 sides, so this would mean creating 6 new paths to each of its vertices (i.e. the blue dotted lines). I would then be able to run Dijkstra's algorithm and it would easily determine that the blue solid line is the optimal path.
The problem with this method is in determining which vertices comprise the smallest polygon that surrounds my point. I cannot create new paths to each vertex in the graph, otherwise I will end up with the red dotted lines as well, which completely defeats the purpose of using Dijkstra's algorithm (I should not be allowed to cross over a sidewalk). Therefore, I must take care to only create paths to the vertices of the enclosing polygon. Is there an algorithm for this?
There is another complication with this solution: imagine the user now starts at the purple lightning bolt. It has no enclosing polygon, yet the algorithm should still work by connecting it to the 3 points at the top right. Again, once it is connected to those, running Dijkstra's is easy.
Update: the reason we want to connect to one of these 3 points and not walk around everything to reach the orange X directly is because we want to minimize the walking done on unpaved paths. (Note: This is only a constraint if you start outside a polygon. We don't care how long you walk on the grass if it is within a polygon).
If this is the correct solution, then please post its algorithm as an answer.
Otherwise, please post a better solution.
You can start off by running Dijkstra from the target to find its distance to all vertices.
Now let's consider the case where you start "inside" the graph on the grass. We want to find all vertices that we can reach via a straight line without crossing any edge. For that we can throw together all the line segments representing the edges and the line segments connecting the start point to every vertex and use a sweep-line algorithm to find whether the start-vertex lines intersect any edge.
Alternatively you can use any offline algorithm for planar point location, those also work with a sweep line. I believe this is in the spirit of the more abstract algorithm proposed in the question in that it reports the polygon that surrounds the point.
Then we just need to find the vertex whose connection line to the start does not intersect any edge and the sum d(vertex, target) + d(vertex, start) is minimum.
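A sketch of this selection step, under some simplifying assumptions: vertex positions are (x, y) tuples, edge weights are Euclidean lengths, and the sweep line is replaced by a plain O(n·m) check of every start-to-vertex segment against every edge. Degenerate cases, such as a segment passing exactly through another vertex, are ignored. All function names are made up for the illustration.

```python
import heapq
import math

def dijkstra(adj, source):
    """Shortest distances from `source`; adj maps v -> [(w, weight), ...]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, math.inf):
            continue  # stale heap entry
        for w, weight in adj[v]:
            nd = d + weight
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

def orient(a, b, c):
    """Sign tells on which side of line ab the point c lies."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def properly_cross(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    return (orient(p, q, a) * orient(p, q, b) < 0
            and orient(a, b, p) * orient(a, b, q) < 0)

def best_entry_vertex(pos, edges, start, target):
    """Visible vertex minimising d(start, v) + d(v, target)."""
    adj = {v: [] for v in pos}
    for u, v in edges:
        w = math.dist(pos[u], pos[v])
        adj[u].append((v, w))
        adj[v].append((u, w))
    to_target = dijkstra(adj, target)  # one run, from the target
    best = None
    for v, p in pos.items():
        if v not in to_target:
            continue
        if any(properly_cross(start, p, pos[a], pos[b]) for a, b in edges):
            continue  # the walk to v would cross a sidewalk
        cost = math.dist(start, p) + to_target[v]
        if best is None or cost < best[0]:
            best = (cost, v)
    return best

# Square block A-B-C-D with a spur C-T; the user stands inside the block.
pos = {"A": (0, 0), "B": (4, 0), "C": (4, 4), "D": (0, 4), "T": (8, 4)}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("C", "T")]
print(best_entry_vertex(pos, edges, (1, 1), "T"))
```

In this example the nearest vertex to the user is A, but the function picks C, because the straight line to T itself crosses edge B-C and entering at C gives the shortest total.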
The procedure when the vertex is outside the graph is somewhat underspecified, but I guess the exact same idea would work. Just keep in mind that there is the possibility to walk all around the graph to the target if it is on the border, like in your example.
This could probably be implemented in O((n+m) log m) per query. If you run an all-pairs shortest path algorithm as a preprocessing step and use an online point location algorithm, you can get logarithmic query time at the cost of the space necessary to store the information to speed up shortest path queries (quadratic if you just store all distance pairs).
I believe simple planar point location works just like the sweep line approaches, only with persistent BSTs to store all the sweepline states.
I'm not sure why you are bothering with trying to find a starting vertex when you already have one. The point you (the user) are standing at is another vertex in and of itself. So the real question now is how to find the distance from your starting point to any other point in the enclosing polygon graph. Once you have that, you can simply run Dijkstra's or another shortest path algorithm such as A*, BFS, etc., to find the shortest path to your goal point.
On that note, I think you are better off implementing A* for this problem because a park involves things like trees, playgrounds, ponds (sometimes), etc. So you will need to use a shortest path algorithm that takes these into consideration, and A* is one algorithm that uses these factors to determine a path of shortest length.
Finding distance from start to graph:
The problem of finding the distance from your new vertex to other vertices can be solved by looking only for points with the closest x or y coordinate to your start point. So this algorithm has to find points that form a sort of closure around the start point, i.e. a polygon of minimum area which contains the point. So, as @Niklas B suggested, a planar point location algorithm (with some modifications) might be able to accomplish this. I was looking at the sweep-line algorithm, but that only works for line segments, so it will not work as is (it is still worth a shot; with modifications it might be able to give the correct answer).
You can also decide to implement this algorithm in stages, so first, find the points with the closest y coordinate to the current point (Both negative and positive y, so have to use absolute value), then among those points, you find the ones with the closest x coordinate to the current point and that should give you the set of points that form the polygon. Then these are the points you use to find the distance from your start to the graph.

Distance between lots of points on a map

I have a 20,000 point array of gps locations.
They represent points on a forest path that need to be checked. I need to figure out how many km of forest path needs checking.
Group the points into routes.
Measure the shortest path of each route.
Which algorithms should I consider, and in which order?
Should I get the shortest path and break it up into routes, or get the routes and then find the shortest path of each?
This solution assumes that you only have the points and don't know which forest path the points are on, in which order, etc.
I would try it this way:
1 Connect each node with every other node by a link, and use the distance as the link weight (or better, the travel time in seconds for the distance in metres between the nodes at 2 km/h: a low speed, assuming that walking through the wood is slower than on an existing forest road).
2 If the forest has difficulties (mountains, valleys, rivers):
2a: ascent/descent: raise the link weight using the altitude difference; look in outdoor planning resources for how many metres of ascent affect travelling time (300 m could be one additional hour as a rough estimate).
2b: valley, river or other limits: either raise the weight again, or remove the link if one cannot go directly from one point to the other (e.g. draw the valley as a polygon and remove all links that cross the polygon).
Are there already paths/forest roads in the wood?
If yes, draw them as links into the model (graph), using e.g. a 5 km/h walking speed for the link weight.
Now you have a graph with nodes and links with hopefully realistic link weights related to the travelling speed between nodes.
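The link weights from steps 1 and 2a could be computed roughly as follows. This is only a sketch: it assumes each point is a (latitude, longitude, elevation in metres) tuple, uses the haversine formula on a spherical Earth for the distance, and penalises only the uphill direction, so the weight is directed. The function and parameter names are my own.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def link_weight_s(p, q, speed_kmh=2.0, ascent_m_per_hour=300.0):
    """Estimated travel time in seconds from p to q.

    p and q are (lat, lon, elevation_m). The defaults are the rough
    figures from the text: 2 km/h off-path, 300 m of ascent per extra hour.
    """
    distance = haversine_m(p[0], p[1], q[0], q[1])
    walking = distance / (speed_kmh * 1000 / 3600)  # metres / (metres per second)
    ascent = max(0.0, q[2] - p[2])                  # only uphill is penalised
    return walking + ascent / ascent_m_per_hour * 3600

a = (48.0000, 11.0000, 500.0)
b = (48.0090, 11.0000, 650.0)  # about 1 km north and 150 m higher
print(round(link_weight_s(a, b)), "seconds uphill")
print(round(link_weight_s(b, a)), "seconds downhill")
```

For links along existing forest roads, the same function can be called with speed_kmh=5.0.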
Now use a shortest path algorithm (Dijkstra's algorithm) and a travelling salesman algorithm.
If all that is too much work (it could be some months for somebody with a degree in computer science), plan it manually: draw a raster of 1000 x 1000 m and let human intelligence do its job.
Since 20,000 points that have to be checked by walking require a lot of effort, it is additionally worth evaluating automatic planning against human experience. Try both variants and see which is more efficient.
(I think that people with outdoor experience, given a good map with contour lines and the checkpoints on it, will do a better job, assuming pre-organisation by assigning points to quadrants and quadrants to people.)
My other solution:
This assumes you have more info that you have not yet posted:
You probably have more info than just the coordinates of the points.
Who created these points? In your graphic, they look as if they are on a path.
Were they recorded while driving along that path with a vehicle? Then you have a timestamp, and therefore an order of points that are in sequence and therefore already related to a path.
So the first step would be to assign the points to a path.
(You could also draw all known forest paths as vectors on a digital map and match the points to the nearest path automatically.)
You need the paths when you cannot reach each node directly on a straight line between them (e.g. driving by vehicle, or walking in the wood when a river prevents a direct straight-line path).
Then, once you have a graph with nodes on links, use a minimum spanning tree to calculate the sum of path lengths in kilometres.
For visiting the points you will often have to return to a branch, so a travelling salesman algorithm will help to give the kilometres needed to visit all nodes.
The question seems to be similar to a constrained vehicle routing problem. You can try a heuristic, for example the savings algorithm: http://neo.lcc.uma.es/vrp/solution-methods/heuristics/savings-algorithms/.

Google Maps: Given a point, how to find all points at a given road distance?

In my app, the GPS picks up the location of the vehicle. The app is then supposed to put markers at all points where the vehicle could be if it drives for 1 km in any direction (note that the roads may fork many times within its 1 km reach).
Can someone suggest how to do this? Thanks in advance.
This is a very tricky problem to solve with the Google Maps API. The following is one method that you may want to consider:
You can easily calculate a bounding circle of 1 km around your GPS point, and it is also easy to calculate points that fall on the circumference of this circle, for any angle. This distance will be "as the crow flies" and not the actual road distance, but you may want to check out the following Stack Overflow post for a concrete implementation of this:
How to calculate the latlng of a point a certain distance away from another?
(Screenshot no longer available: markers at 20 degree intervals on a bounding circle with a 1 km radius.)
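Independently of the linked post, the points on the bounding circle can be computed with the standard destination-point formula on a sphere. The sketch below is not Google Maps API code. It assumes a spherical Earth, and the function names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)

def destination(lat, lon, bearing_deg, distance_m):
    """Point reached from (lat, lon) after distance_m on an initial bearing."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    # Longitude is not normalised to [-180, 180] in this sketch.
    return math.degrees(phi2), math.degrees(lam2)

def circle_points(lat, lon, radius_m=1000, step_deg=20):
    """Points on the circle around (lat, lon), one every step_deg degrees."""
    return [destination(lat, lon, bearing, radius_m)
            for bearing in range(0, 360, step_deg)]

for point in circle_points(51.5, -0.12)[:3]:
    print(point)
```

With the default 20 degree step this yields 18 candidate points, which would then be snapped to the nearest road as described next.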
There is also a trick to snap these points to the nearest street. You can check out Mike Williams' Snap point to street examples for a good implementation of this.
Calculating the road distance from your GPS point to each snapped road point could be done with the directions service of the Google Maps API. Note that this will only work in countries that support directions in Google Maps, but more importantly, the road distance will almost always be greater than 1km, because our bounding circle has a 1km radius "as the crow flies". However if you can work with approximate information, this may already be one possible solution.
You can also consider starting with the above solution (1 km bounding circle, calculate x points on the circumference, and snap them to the closest road), then calculate the road distance of each path (from your GPS point to each snapped point), and then repeat this recursively for each path, each time using a smaller bounding circle, until you reach a road distance close to 1 km. You can decrease the bounding circle in each recursion, in proportion to the error margin, to make your algorithm more efficient.
UPDATE:
I found a very neat implementation which appears to be using a similar method to the one I described above:
Driving Radius (Multiple destinations)
Note how you can change the interval of degrees from the top. With a wide interval you'll get fast results, but you could easily miss a few routes.
A natural brute-force algorithm is to build a list of all possible nodes, taking into account each possible decision at every crossroad.
I doubt that within 1 km you would get more than 10 crossroads on average, and assuming an average of 3 choices at a crossroad you would end up with 3^10 = 59,049 end nodes (notice that you need to have 10 crossroads on every branch of the road to reach the full number).
In reality the number would go down, and I would assume that getting to the same node by a different route would not be uncommon, especially in cities.
This approach would give you an exact answer (provided you have a good street map as input). It takes exponential time, but n does not seem to be that high, so it might be practical.
Further improvements and optimizations might be possible depending on what you need these nodes for (or which kinds of scenarios you would consider similar enough to prune).
Elaborating a bit on Daniel's approach above, you first want to find all the points within a straight-line radius of your origin. That's your starting set of nodes. Now include ALL edges between those nodes and other nodes in your starting set. Now check that the nodes are connected and that there aren't any nodes floating around that you can't reach. Now create a "shortest path tree" starting from your vehicle node.
The tree will give you the shortest paths from your starting node to all other nodes. Note that if you start by creating paths at the furthest nodes, any sub-paths are also shortest paths to those nodes along the way. Make sure to label those nodes on sub-paths as you continue so you don't need to compute them. Worst case scenario, you need to develop a shortest path for all nodes, but in practice this should take much less time.
List all possible nodes, taking into account each possible decision at every crossroad.
(But how to do it automatically?)
Use Dijkstra's algorithm to find the closest route to all points.
Visualize the data.
(That is a little bit tricky, because there can be unreachable areas inside the reachable area.)
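The Dijkstra step above can be sketched as a search that stops expanding once the distance budget is used up. This assumes you already have the road network as a weighted graph with edge lengths in metres, which the Google Maps API does not hand out directly. `reachable_within` is a hypothetical helper, not a library call.

```python
import heapq
import math

def reachable_within(adj, source, limit):
    """Road distances to all nodes within `limit`, plus frontier points.

    adj maps node -> [(neighbour, length_m), ...]. The frontier is a list
    of (u, v, fraction): the budget runs out that fraction of the way
    along the edge from u to v.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in adj[u]:
            nd = d + length
            if nd <= limit and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    frontier = []
    for u, d in dist.items():
        for v, length in adj[u]:
            x = limit - d  # how far along (u, v) the budget lasts
            # Skip the point if it is closer to the source via v's side.
            if x < length and dist.get(v, math.inf) + (length - x) >= limit:
                frontier.append((u, v, x / length))
    return dist, frontier

adj = {
    "A": [("B", 400)],
    "B": [("A", 400), ("C", 400), ("E", 700)],
    "C": [("B", 400), ("D", 400)],
    "D": [("C", 400)],
    "E": [("B", 700)],
}
dist, frontier = reachable_within(adj, "A", 1000)
print(dist)      # {'A': 0.0, 'B': 400.0, 'C': 800.0}
print(frontier)  # markers 6/7 of the way along B-E and halfway along C-D
```

The frontier entries are where the markers would go. Every node in `dist` and every edge portion before a frontier point is reachable within the budget.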
