Data structure for a Google Maps-like application? - data-structures

I am writing a map-routing application. Several people have suggested that I use a data structure that splits the map into a grid. In theory it sounds really good, but I am not too sure, because of the bad performance I get when I implement it.
In the worst case you have to draw every road. If you divide the map into a grid, the total number of roads stored across all the cells will be much larger than if you put all roads in a single list, because a road that passes through several cells has to be stored in each of them.
If I have to zoom in I can see some smartness in using a grid, but if I keep the roads in a list I can just decrease the number of roads each time I zoom in.
As it is now (using the list) it is not really fast, so I am all for making it faster. But in practice, dividing the map into a grid makes it slower for me.
Any suggestions for what data structure I should be using, and/or what I might be doing wrong?

See this question for related information:
What algorithms compute directions from point A to point B on a map?
Somebody who writes this kind of software for a living has answered it.
Also for rendering see:
What is the best way to read, represent and render map data?
I'm not quite sure whether you're trying to make routing or rendering quick!

If you want routing to be quick, you might be better off organizing your roads into major and minor roads.
Use the list of minor roads to find a route to the nearest major road.
Use the major roads to get you near the destination.
Then go back to the minor roads to complete the route.
Without a split like this, there are a heck of a lot of roads to search, most of which are quite slow routes.

Google does not draw each road every time the screen is refreshed. They use pre-drawn tiles of the map, which they can redraw as needed, e.g. when there is a map update. They even use transparent overlays, stacks of tiles, to add and remove layers of detail.
Very clever, but very simple.
You may want to look at the OpenLayers JavaScript library. It is free and can do just about anything you need to do with a map.
Mapstraction is also available, though it's not as complete as OpenLayers.

A quadtree might be a better spatial data structure than a grid, because it subdivides the map recursively, so a lookup only has to descend a roughly logarithmic number of levels. From studying the source, my guesstimate is that Google uses that or a similar data structure.
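To make the quadtree idea concrete, here is a minimal sketch in Python. It stores points (for example road nodes) rather than whole road segments, and the class name, the `capacity` and `max_depth` parameters and the method names are all my own choices for illustration, not anything taken from Google or from a particular library.

```python
class QuadTree:
    """Point quadtree over the rectangle (x0, y0)-(x1, y1). Illustrative only."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=16):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity      # points a leaf holds before it splits
        self.depth = depth
        self.max_depth = max_depth    # guards against endless splitting
        self.points = []              # (x, y, item) tuples stored in this leaf
        self.children = None          # four sub-quadrants once split

    def insert(self, x, y, item):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False              # point lies outside this node
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, item))
                return True
            self._split()
        # any() stops at the first child that accepts the point
        return any(child.insert(x, y, item) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        for x, y, item in self.points:    # push stored points down one level
            any(child.insert(x, y, item) for child in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1, out=None):
        """Collect every item whose point lies inside the query rectangle."""
        if out is None:
            out = []
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return out                # no overlap: skip this whole subtree
        for x, y, item in self.points:
            if qx0 <= x <= qx1 and qy0 <= y <= qy1:
                out.append(item)
        if self.children is not None:
            for child in self.children:
                child.query(qx0, qy0, qx1, qy1, out)
        return out
```

For rendering, the query rectangle would be the current viewport. Whole subtrees outside the viewport are skipped without looking at their contents, which is where the gain over a flat list comes from.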
As for getting directions, you might want to look into hierarchical path finding to approximate the route at first and to speed up the process; generic path finding algorithms tend to be quite slow at that level of complexity.

Related

When should these methods be used to calculate blob orientation?

In image processing, each of the following methods can be used to get the orientation of a blob region:
1. Using second order central moments
2. Using PCA to find the axis
3. Using distance transform to get the skeleton and axis
4. Other techniques, like fitting the contour of the region with an ellipse.
When should I consider using a specific method? How do they compare, in terms of accuracy and performance?
I'll give you a vague general answer, and I'm sure others will give you more details. This issue comes up all the time in image processing: "There are N ways to solve my problem, which one should I use?" The answer is to start with the simplest one that you understand the best. For most people, that's probably the first or second method in your example (moments or PCA). In most cases they will give nearly identical results and will be sufficient. If for some reason the techniques don't work on your data, you have now learned for yourself a case where they fail, and you need to start exploring other techniques. This is where the hard work comes in when you are an image processing practitioner. There are no silver bullets; there's a grab bag of techniques that work in specific contexts, which you have to learn and figure out. When you learn this for yourself, you will become godlike among your peers.
For this specific example, if your data is roughly ellipsoidal, all these techniques will give similar results. As your data moves away from ellipsoidal (say, spider-like), the PCA, second order moments and contour approaches will start to give poor results. The skeleton approaches become more robust, but mapping a complex skeleton to a single axis or orientation can become a very difficult problem, and may require more a priori knowledge about the blob.
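As a rough illustration of the simplest option, here is a sketch of the second order central moments approach in plain Python. The function name and the list-of-rows mask format are assumptions made for the example. The angle is measured from the x axis in image coordinates (y pointing down) and follows the usual formula 0.5 * atan2(2*mu11, mu20 - mu02).

```python
import math

def blob_orientation(mask):
    """Orientation (radians) of the foreground pixels in a binary mask.

    mask is a list of rows; any truthy value counts as foreground.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("mask contains no foreground pixels")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n          # centroid
    cy = sum(y for _, y in pts) / n
    # second order central moments (normalised by the pixel count)
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
```

The three moments are the entries of the covariance matrix of the pixel coordinates, so running PCA on those coordinates yields the same axis. For a blob with mu20 equal to mu02 and mu11 equal to zero (a disc or a square) the orientation is undefined and the returned value is meaningless.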

KDTree Splitting

I am currently writing a KDTree for a physics engine (Hobby project).
The KDTree does not contain points.
Instead it contains Axis Aligned bounding boxes which bound the different objects in the environment.
My problem is deciding on how to split the KDTree nodes when they get full.
I am trying 2 methods:
Method 1: Always split the node exactly in half on its biggest axis.
This has the advantage of a pretty evenly spaced out tree.
Big disadvantage: if objects are concentrated in a small area of the node, redundant subdivisions will be created, because every volume is split exactly in half.
Method 2: Find the area of the node which contains objects, and split the node on the plane which divides that area in half on its biggest axis. Example: if all objects are concentrated at the bottom of the node, then it is split lengthwise, thereby dividing the bottom in two.
This solves the problem with the method above.
Disadvantage: when indexing something that lies on a single plane (terrain, for example), it creates long and narrow nodes. If I later add other objects which are not on that plane, these elongated nodes provide very poor indexing.
So what I'm looking for here is a better way to split my KD-Tree node.
Considering that this is going to be a physics engine the decision needs to be simple enough to be made in real time.
The "surface area heuristic" (SAH) is considered the best splitting method for building kd-trees, at least within the raytracing community. The idea is to choose the splitting plane that minimizes the sum, over the two child spaces, of each child's surface area weighted by the number of objects in that child.
A good reference on the subject is Ingo Wald's thesis, in particular chapter 7.3, "High-quality BSP Construction", which explains SAH better than I can.
I can't find a good link at the moment, but you should look around for papers on "binned" SAH, which is an approximation to the true SAH but much faster.
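To give a feel for what the heuristic computes, here is a small Python sketch of an exhaustive SAH split search over axis-aligned boxes. It is not the binned variant, and the function names and the `c_trav` and `c_isect` cost constants are placeholders I picked for illustration; see Wald's thesis for the real derivation.

```python
def surface_area(lo, hi):
    """Surface area of the axis-aligned box with corners lo and hi."""
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def sah_cost(boxes, node_lo, node_hi, axis, pos, c_trav=1.0, c_isect=1.0):
    """Estimated cost of splitting the node at `pos` along `axis`.

    boxes is a list of (lo, hi) corner pairs. A box that straddles the
    plane is counted in both children, as in a kd-tree.
    """
    left_hi = list(node_hi)
    left_hi[axis] = pos
    right_lo = list(node_lo)
    right_lo[axis] = pos
    n_left = sum(1 for lo, hi in boxes if lo[axis] < pos)
    n_right = sum(1 for lo, hi in boxes if hi[axis] > pos)
    weighted = (surface_area(node_lo, left_hi) * n_left
                + surface_area(right_lo, node_hi) * n_right)
    return c_trav + c_isect * weighted / surface_area(node_lo, node_hi)

def best_split(boxes, node_lo, node_hi):
    """Try every box face as a candidate plane; return (cost, axis, pos) or None."""
    best = None
    for axis in range(3):
        candidates = {lo[axis] for lo, _ in boxes} | {hi[axis] for _, hi in boxes}
        for pos in sorted(candidates):
            if node_lo[axis] < pos < node_hi[axis]:
                cost = sah_cost(boxes, node_lo, node_hi, axis, pos)
                if best is None or cost < best[0]:
                    best = (cost, axis, pos)
    return best
```

With two unit boxes at x in [0, 1] and [2, 3] inside a node that spans x in [0, 10], the cheapest plane is x = 3, which cuts off the empty part of the node. That addresses the redundant subdivision problem of always splitting in the middle. A node is normally left as a leaf when the best split cost is not lower than `c_isect` times the number of boxes.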
All that being said, bounding-volume hierarchies (BVH) a.k.a. AABB trees, seem to be much more popular than kd-trees these days. Again, Ingo Wald's publication page is a good starting point, probably with the "On fast Construction of SAH based Bounding Volume Hierarchies" paper, although it's been a while since I read it.
The OMPF forums are also a good place to discuss these sorts of things.
Hope that helps. Good luck!
Certainly for a physics engine, where the premise is lots of moving geometry, a BVH is probably the better choice. BVHs don't traverse quite as quickly, but they are much faster to build and much easier to refit or restructure from frame to frame, and they often don't need a complete rebuild every frame. (A rebuild can be done in parallel over a series of frames while the refitted BVH suffices in the meantime; again, refer to Wald.)
An exception to this in physics could be when you're dealing with entities that have no volume, such as particles or photons: building the kd-tree is simplified by the fact that you don't need to resolve the bounds of each individual primitive. It really depends on the application. A good physics engine should use a balanced combination of spatial acceleration structures. It's common practice to resolve the broad phase with, say, a shallow octree, then extend the leaf nodes with another scheme that better fits the nature of what you are doing. BSPs are ideal for static geometry, especially in 2D and when the structure isn't changing. The best thing to do is experiment with as many different schemes and structures as you can, and get a feel for how and when they work best.

Algorithm for schematizing (metro) maps

This is a long shot, but I thought I might try before starting the dirty work.
I've got a project to build an application which will, for a given input of stations (vertices) and lines (edges), that is, a real map of some public transportation network, schematize that map into a metro map. I've done some research on the problem, and it's an NP-complete problem, equivalent to the 3-SAT problem. I also have some theoretical ideas on how to generate such a map, but they aren't detailed enough.
What I'm looking for is any other existing solution of this problem, some sort of pseudo-code, some real code in (almost) any other programming language etc, anything that would reduce the time I need to spend working on the algorithm itself, which will in return give me more time to work on other aspects of the application.
If anyone has ever seen anything that can help me, I'd appreciate it very much.
If you google for "metro map layout problem" and "metro map line crossing" you'll find a lot of references, since it has been researched very actively in the past 10 years.
The problem does not seem trivial at all, and translating the "artistic" features into mathematical constraints is seemingly one of the most difficult tasks.
Anyway here are three publications that I found interesting to start with (among many, many others):
Metro Map Layout Using Multicriteria Optimization
Line Crossing Minimization on Metro Maps
The Metro Map Layout Problem
HTH!
Research that's similar to your topic: http://graphics.stanford.edu/papers/routemaps/
This is just some suggestion with handwaving - take with a pinch of salt.
My notion of a "metro" map is one where lines tend to follow one of the eight principal compass directions (horizontal, vertical and the two diagonals) and stations are regularly spaced.
I'm assuming you're trying to convert a set of real coordinates into "metro" coordinates.
I would start with your main route (e.g., a city loop), then incrementally add other routes in order of importance.
For each route you want to find the nearest approximation that uses the fewest straight lines travelling in the eight principal compass directions. You might do this by starting with the bounding box for the real coordinates, splitting that into a grid, then finding a "metro" route from grid square to grid square, then successively refining that route to reduce the number of bends without distorting the map too much and without introducing crossings with other routes if at all possible.
Having done that, scale each line so that consecutive stations are the same distance apart on the "metro" view.
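One small building block of the approach above, snapping a real-world segment to the nearest of the eight directions, could look like this in Python. It is only a sketch: the names are made up for the example, and it covers none of the harder parts (bend reduction, crossing avoidance, station spacing).

```python
import math

# The eight allowed directions as unit grid steps, counter-clockwise from east.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def snap_octilinear(dx, dy):
    """Return the grid step whose direction is closest to the vector (dx, dy)."""
    angle = math.atan2(dy, dx)                 # in (-pi, pi]
    k = round(angle / (math.pi / 4)) % 8       # nearest multiple of 45 degrees
    return DIRECTIONS[k]
```

For example, a segment running mostly east with a slight northward drift, such as (10, 1), snaps to (1, 0).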
My guess is you'll still want to support manual tweaking of the result.
Good luck!
Feels like a planning problem.
Looks like your hard constraints are:
Every station must be on a point. Points are on a grid with a distance of X between points (I'd fix this at 2 cm).
There should not be 2 stations on the same spot
There should be enough room to draw the station label. Note that the label can be assigned different directions from the point to which the station is assigned.
There should be enough room to draw the subway lines.
Looks like your soft constraints are:
For each station, minimize the distance between its actual geographical location and the point assigned to the station.
Then throw something like Drools Planner on it, here's an example of hard and soft constraints for nurse rostering.

Looking for resources on fast ray-tracing algorithms

First, I am sorry for this rough question, but I don't want to go into too much detail, so I am just asking for related resources such as articles, libraries or tips.
My program needs to do intensive computation of ray-triangle intersections (there are millions of rays and triangles), and my goal is to make it as fast as I can.
What I have done is:
Use the fastest ray-triangle algorithm that I know.
Use an octree (from Game Programming Gems 1, 4.10 and 4.11).
Use "An Efficient and Robust Ray–Box Intersection Algorithm", which is used in the octree algorithm.
It is faster than before I applied those better algorithms, but I believe it could be faster. Could you please shed light on any places where it could be made faster?
Thanks.
The place to ask these questions is ompf2.com, a forum with topics about realtime (and also non-realtime) raytracing.
OMPF forum is the right place for this question, but since I'm here today...
Don't use a ray/box intersection for octree traversal. You may use it for the root node of the tree, but that's it. Once you know the distances to the entry and exit of the root box, you can calculate the distances to the x, y and z partition planes, i.e. the planes that subdivide the box. If the distances to the front and back are f and b respectively, then you can determine which child nodes of the box are hit by analyzing the f, b, x, y, z distances. You can also determine the order in which to traverse the child nodes, and completely reject many of them.
At most 4 of the children can be hit since the ray starts in one octant and only changes octants when it crosses one of the 3 partition planes.
Also, since it becomes recursive you'll be needing the entry and exit distances for the child nodes. These distances are chosen from the set (f,b,x,y,z) which you've already computed.
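Here is how I would sketch that idea in Python. It is a deliberately simple version, not an optimized traversal: it sorts the plane distances that fall between f and b and classifies the midpoint of each resulting interval. The function name and the child numbering (bit i set means the upper half along axis i) are assumptions for the example.

```python
def children_hit(origin, direction, box_min, box_max, f, b):
    """Children of an octree node crossed by a ray, in front-to-back order.

    f and b are the ray parameters at which the ray enters and leaves the
    node's box. Returns a list of (child_index, t_enter, t_exit); bit i of
    child_index is set when the child is on the upper side of axis i.
    """
    center = [(lo + hi) / 2.0 for lo, hi in zip(box_min, box_max)]

    # Distances to the three partition planes, kept only if inside (f, b).
    cuts = []
    for axis in range(3):
        if direction[axis] != 0.0:
            t = (center[axis] - origin[axis]) / direction[axis]
            if f < t < b:
                cuts.append(t)

    # Between two consecutive distances the ray stays inside one child.
    ts = [f] + sorted(cuts) + [b]
    hits = []
    for t0, t1 in zip(ts, ts[1:]):
        if t1 <= t0:
            continue                     # two planes crossed at the same t
        tm = 0.5 * (t0 + t1)             # a point strictly inside the interval
        index = 0
        for axis in range(3):
            if origin[axis] + tm * direction[axis] >= center[axis]:
                index |= 1 << axis
        hits.append((index, t0, t1))
    return hits
```

Each returned interval supplies the f and b values for the recursive call on that child, so no further ray/box tests are needed below the root. Since at most three plane distances can fall between f and b, the list never has more than four entries.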
I have been optimizing this for a very long time, and can safely say you have about an order of magnitude performance still on the table for trees many levels deep. I started right where you are now.
There are several optimizations you can do, but all of them depend on the exact domain of your problem. As far as general algorithms go, you are on the right track. Depending on the domain, you could:
Introduce a portal system
Move the calculations to a GPU and take advantage of parallel computation
A quite popular trend in raytracing recently is Bounding Volume Hierarchies
You've already gotten a good start using a spatial sort coupled with fast intersection algorithms. For tracing single rays at a time, one of the best structures out there (for static scenes) is a K-d tree built using the Surface Area Heuristic.
However, for truly high-speed ray tracing you need to take advantage of:
Coherent packets of rays
Frusta
SIMD
I would suggest you start with "Ray Tracing Animated Scenes using Coherent Grid Traversal". It gives an easy-to-follow example of such a modern approach. You can also follow the references to see how these ideas are applied to K-d trees and BVHs.
On the same page, also check out "State of the Art in Ray Tracing Animated Scenes".
Another great set of resources are all the SIGGRAPH publications over the years. This is a very competitive conference, so these papers tend to be top-notch.
Finally, if you're willing to use existing code, check out the project page for OpenRT.
A useful resource I've seen is the Journal of Graphics Tools. Depending on your scenes, a different BVH might be more appropriate than an octree.
Also, if you haven't looked at your performance with a profiler, then you should. Shark is great on OS X, and I've gotten good results with Very Sleepy on Windows.

anything better than bounding boxes?

I have a scenario, where I have x million longitude latitude points.
When a new long/lat point is added I want to know efficiently which other points are within a user configured distance parameter, so I can add them to a list.
got anything better than bounding boxes?
I would love to see algorithms, references and a few implementations ;) thank you kindly!
There are quite a few options that are better, mostly based around space partitioning.
A common, and often very good, option (which isn't too tough to implement) is to use a KD-Tree. Quadtrees are easier to implement, but slower for searching. Depending on the distribution of your data and your requirements, other space partitioning algorithms may perform better or have lower memory requirements.
A colleague told me that he had good experience with using Morton-Code as a spatial index on GIS data, maybe that is something worth investigating.
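For reference, a Morton code (Z-order) simply interleaves the bits of the quantized coordinates, so that points close together in 2D tend to get nearby keys that can be stored in an ordinary sorted index. Below is a minimal Python sketch; the 16-bit quantization per axis and the function names are assumptions for the example.

```python
def _spread_bits(n):
    """Insert a zero bit between each of the low 16 bits of n."""
    n &= 0xFFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n

def morton2d(x, y):
    """Interleave two 16-bit integers: x takes the even bits, y the odd bits."""
    return _spread_bits(x) | (_spread_bits(y) << 1)

def latlon_to_morton(lat, lon):
    """Quantize latitude/longitude to a 65536 x 65536 grid and interleave."""
    x = min(int((lon + 180.0) / 360.0 * 65536), 65535)
    y = min(int((lat + 90.0) / 180.0 * 65536), 65535)
    return morton2d(x, y)
```

Nearby keys imply nearby points, but not the other way round: two neighbouring points can straddle a large jump in the curve, so a radius search still has to examine several key ranges.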
This quick-and-dirty approach may save you some grief: Divide the surface of the earth into 1 degree boxes. You will then have a 180x360 element array and you will only need to search a small number of boxes, including the box containing the new point and all the boxes immediately around it for which one of the corners is within the user-specified distance. You will find that there are some tricks you can use to quickly figure out what boxes to use without considering them all. Just don't forget latitude and longitude wrap-around.
If you "only" have millions of points, and they aren't clustered into hot-spots, that might get you through.
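A sketch of the one-degree box approach in Python follows. Instead of testing box corners it computes a conservative range of rows and columns to scan, then filters the candidates with the haversine formula. The class and method names and the spherical Earth radius of 6371 km are assumptions made for the example.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0   # mean radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

class DegreeGrid:
    """Points bucketed into 1 x 1 degree cells (180 rows by 360 columns)."""

    def __init__(self):
        self.cells = defaultdict(list)

    @staticmethod
    def _cell(lat, lon):
        row = min(int(math.floor(lat + 90.0)), 179)
        col = int(math.floor(lon + 180.0)) % 360    # longitude wraps around
        return row, col

    def add(self, lat, lon, item):
        self.cells[self._cell(lat, lon)].append((lat, lon, item))

    def within(self, lat, lon, radius_km):
        delta = radius_km / EARTH_RADIUS_KM          # angular radius, radians
        dlat = math.degrees(delta)
        lo_row = max(int(math.floor(lat - dlat + 90.0)), 0)
        hi_row = min(int(math.floor(lat + dlat + 90.0)), 179)
        # Conservative bound on the longitude span; near a pole scan every column.
        max_abs_lat = abs(lat) + dlat
        ratio = 2.0
        if max_abs_lat < 90.0 and delta < math.pi:
            ratio = math.sin(delta / 2) / math.cos(math.radians(max_abs_lat))
        if ratio >= 1.0:
            cols = range(360)
        else:
            dlon = math.degrees(2 * math.asin(ratio))
            first = int(math.floor(lon - dlon + 180.0))
            last = int(math.floor(lon + dlon + 180.0))
            cols = {c % 360 for c in range(first, last + 1)}
        hits = []
        for row in range(lo_row, hi_row + 1):
            for col in cols:
                for plat, plon, item in self.cells.get((row, col), ()):
                    if haversine_km(lat, lon, plat, plon) <= radius_km:
                        hits.append(item)
        return hits
```

A new point would first be passed to `within` to collect its neighbours and then stored with `add`. Cells shrink towards the poles, so the number of columns scanned grows at high latitudes.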
A theoretically superior way: You could map each point into three dimensional space and then store them in an octree, which would let you quickly find nearby points to within an arbitrary distance. Of course, the distance in three-dimensional space will be slightly different than the great-circle distance on the globe, so you will have to calculate a conversion factor. That should be simple, though. You don't mention an implementation language, but there is almost certainly going to be a well-tested octree implementation for any language you are working in. If you don't mind inserting the third-party code, this solution is the way to go.
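The conversion mentioned above is short. On a sphere of radius R, two points separated by a great-circle distance d are a straight-line (chord) distance of 2 R sin(d / (2 R)) apart, so the user's radius can be converted once and then used directly in the 3D structure. Here is a Python sketch, assuming a spherical Earth of radius 6371 km and with function names invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0   # spherical approximation

def latlon_to_xyz(lat, lon, radius=EARTH_RADIUS_KM):
    """Convert degrees of latitude/longitude to 3D Cartesian coordinates."""
    phi, lmb = math.radians(lat), math.radians(lon)
    return (radius * math.cos(phi) * math.cos(lmb),
            radius * math.cos(phi) * math.sin(lmb),
            radius * math.sin(phi))

def chord_for_arc(arc_km, radius=EARTH_RADIUS_KM):
    """Straight-line distance through the sphere for a great-circle distance."""
    return 2.0 * radius * math.sin(arc_km / (2.0 * radius))

def arc_for_chord(chord_km, radius=EARTH_RADIUS_KM):
    """Inverse of chord_for_arc: great-circle distance for a chord length."""
    return 2.0 * radius * math.asin(chord_km / (2.0 * radius))
```

Because the chord length grows monotonically with the arc length (up to half the circumference), "within d km along the surface" is exactly the same set of points as "within chord_for_arc(d) km in 3D". The octree or KD-tree query therefore needs no post-filtering.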
