Searching geocoded information by distance - algorithm

I have a database of addresses, all geocoded.
What is the best way to find all addresses in our database within a certain radius of a given lat, lng?
In other words a user enters (lat, lng) of a location and we return all records from our database that are within 10, 20, 50 ... etc. miles of the given location.
It doesn't have to be very precise.
I'm using MySQL DB as the back end.

There are Spatial extensions available for MySQL 5 - an entry page to the documentation is here:
http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html
There are lots of details of how to accomplish what you are asking, depending upon how your spatial data is represented in the DB.
Another option is to make a function for calculating the distance using the Haversine formula mentioned already. The math behind it can be found here:
www.movable-type.co.uk/scripts/latlong.html
Hopefully this helps.
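As a rough illustration of the Haversine option, here is a minimal Python sketch. The function names, the (id, lat, lng) row layout and the Earth radius constant are my own assumptions, not something from the question, and the filter runs in application code rather than in MySQL.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def within_radius(rows, lat, lng, radius_miles):
    """Keep the (id, lat, lng) rows that lie within radius_miles of (lat, lng)."""
    return [r for r in rows
            if haversine_miles(lat, lng, r[1], r[2]) <= radius_miles]
```

Scanning every row this way is only reasonable for small tables; for larger ones you would narrow the candidates first.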

You didn't mention your database, but in SQL Server 2008 this is easy when you use the geography data type.
This will find all zipcodes within 20 miles of zipcode 10028 (STDistance returns meters, hence the 1609.344 factor):
SELECT h.*
FROM zipcodes g
JOIN zipcodes h ON g.zipcode <> h.zipcode
AND g.zipcode = '10028'
AND h.zipcode <> '10028'
WHERE g.GeogCol1.STDistance(h.GeogCol1)<=(20 * 1609.344)
See also: SQL Server 2008 Proximity Search With The Geography Data Type.
The SQL Server 2000 version: SQL Server Zipcode Latitude/Longitude proximity distance search.

This is a typical spatial search problem.
1> What DB are you using? SQL Server 2008, Oracle, ESRI geodatabase, and PostGIS are some spatial DB engines which have this functionality.
2> Otherwise, you are probably looking for a spatial algorithm library. You could code it yourself, but I wouldn't suggest that, because computational geometry is a complicated subject.

If you're using a database which supports spatial types, you can build the query directly, and the database will handle it. PostgreSQL, Oracle, and the latest MS SQL all support this, as do some others.
If not, and precision isn't an issue, you can do a search in a box instead of by radius, as this will be very fast. Otherwise, things get complicated, as the actual conversion from lat-long -> distances needs to happen in a projected space (since the distances change in different areas of the planet), and life gets quite a bit nastier.
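For the box search, a minimal Python sketch of turning a radius into latitude and longitude bounds could look like this. The 69 miles per degree figure is an approximation I am assuming, and the function name is made up for the example.

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # approximate; assumed constant


def bounding_box(lat, lng, radius_miles):
    """Return (min_lat, max_lat, min_lng, max_lng) enclosing the search circle.

    Ignores the poles and the 180-degree meridian, which is usually
    acceptable for a rough search.
    """
    dlat = radius_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlng = radius_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng
```

The four values can then be bound into two BETWEEN conditions on indexed latitude and longitude columns.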

I don't remember the equation off the top of my head, but the Haversine formula is what is used to calculate distances between two points on the Earth. You may Google the equation and see if that gives you any ideas. Sorry, I know this isn't much help, but maybe it will give a place to start.

If it doesn't have to be very accurate, and I assume you have an x and y column in your table, then just select all rows in a big bounding rectangle, and use Pythagoras (or Haversine) to trim off the results in the corners.
e.g. select * from locations where (x between xpos-10 miles and xpos+10 miles) and (y between ypos-10 miles and ypos+10 miles).
Remember Pythagoras is sqrt(x_dist^2 + y_dist^2).
It's quick and simple, easy to understand, and doesn't need funny joins.
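Here is a small runnable sketch of that two-step idea in Python. It uses an in-memory SQLite table purely for demonstration (the question uses MySQL), and the locations(id, x, y) schema and sample rows are hypothetical, with x, y and the radius assumed to share one planar unit such as miles.

```python
import math
import sqlite3


def nearby(conn, xpos, ypos, radius):
    """Rows of the hypothetical locations(id, x, y) table within radius of (xpos, ypos)."""
    # Step 1: cheap bounding-rectangle filter done by the database.
    rows = conn.execute(
        "SELECT id, x, y FROM locations "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
        (xpos - radius, xpos + radius, ypos - radius, ypos + radius),
    ).fetchall()
    # Step 2: Pythagoras trims the rows sitting in the corners of the rectangle.
    return [r for r in rows if math.hypot(r[1] - xpos, r[2] - ypos) <= radius]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (id INTEGER PRIMARY KEY, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?)",
    [(1, 3.0, 4.0), (2, 9.0, 9.0), (3, 50.0, 0.0)],  # made-up sample rows
)
print([r[0] for r in nearby(conn, 0.0, 0.0, 10.0)])  # prints [1]
```

Row 2 passes the rectangle test but is trimmed by the distance check, and row 3 never leaves the database.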

Related

Oracle SQL RDBMS tables data to mind map node structure

I have done a lot of searching on this but not sure what my approach should be and looking for community advice. I would like to plot the graphical points of dependent data. The data already resides in a RDBMS and I have made some layered queries to plot the points and adapted to a node tree structure in Tableau. I have gotten to the second level, but it is quite convoluted and inefficient to continue down that path.
I am surprised a hierarchical db would be required for this. It's mostly algebraic logic.
Here's the concept:
Adapted from (and credit to) Data + Science: Node-Link Tree Diagram in Tableau
I took this data and queried against a table with the t value. Then I applied the Sigmoid function in Tableau to create the connecting lines.
Here are my results so far.
Note the plot points are based off a 3rd level not shown which has 71 data points to it. This is why Level 1 y points are 36, 18, -18, 36. Level 2 is just plotted badly due to the algebra.
The goal is to get POSITION_1 & POSITION_2 to dynamically plot so that any changes in the tables automatically update the final result, ie. a living node-tree based off the RDBMS data.
Is my approach way off kilter, or is there a better way?
BTW, it's not to be blown to millions of nodes, just to probably 5 levels, somewhere in the hundreds.

How to quickly determine which longitude & latitude range a point is in

Suppose I have some (lng, lat) coordinate. I also have a big list of ranges,
[ { northeast: {lng, lat}, southwest: {lng, lat} } ... ]
How can I most efficiently determine which bucket the (lng, lat) point goes into?
Also, from a design perspective, would it make more sense for the "list of ranges" to live in a database like MySQL or MongoDB, or in something like memcached or Redis?
Thank you.
An SQL database might be a good answer. If you imagine a table like (bucketId, latNe, longNe, latSw, longSw), with indices on all the lat/long columns, then you could very efficiently get an answer by preparing and executing a query like SELECT bucketId FROM bucketTable WHERE latNe >= ? AND longNe >= ? AND latSw <= ? AND longSw <= ? using the desired lat/long coordinate (bound as lat, long, lat, long). This assumes no bucket crosses the 180-degree meridian.
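The same containment test can be sketched in Python as a plain linear scan, which is a reasonable baseline before reaching for an index. The function name is made up, and the dict layout simply mirrors the example in the question.

```python
def find_bucket(buckets, lng, lat):
    """Return the index of the first range containing (lng, lat), or None.

    Each bucket is a dict shaped like the question's example:
    {"northeast": {"lng": ..., "lat": ...}, "southwest": {"lng": ..., "lat": ...}}.
    Ranges that cross the 180-degree meridian are not handled.
    """
    for i, b in enumerate(buckets):
        ne, sw = b["northeast"], b["southwest"]
        # Inside means: between the south-west and north-east corners on both axes.
        if sw["lat"] <= lat <= ne["lat"] and sw["lng"] <= lng <= ne["lng"]:
            return i
    return None
```

This costs one pass over the list per lookup, so it only suits a modest number of ranges.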
You need to subdivide the list of ranges. You can look into a quadkey. It's similar to a quadtree and uses a Morton curve. You can compute the quadkey of the ranges and the points very quickly. You could also try a rectangle tree or an interval tree.
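To give an idea of what a quadkey looks like, here is a minimal Python sketch. It is my own simplification: it quantizes onto a plain latitude/longitude grid (not the Web Mercator tiling some map providers use) and interleaves the bits of the cell coordinates in Morton order. The function name and the grid choice are assumptions.

```python
def quadkey(lat, lng, level):
    """Base-4 string locating (lat, lng) in a 2^level x 2^level grid."""
    n = 1 << level
    # Quantize to integer cell coordinates, clamping the far edges into the grid.
    x = min(n - 1, int((lng + 180.0) / 360.0 * n))
    y = min(n - 1, int((90.0 - lat) / 180.0 * n))
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1  # east half of the current cell
        if y & mask:
            digit += 2  # south half of the current cell
        digits.append(str(digit))
    return "".join(digits)
```

A useful property is that the key of a coarse cell is a prefix of the keys of every finer cell inside it, so a range can be stored under a short key and a point matched against it by prefix comparison.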
The R-Tree is a data structure designed for this kind of thing. Boost contains an implementation of it, as does CGAL. Most modern databases support this kind of thing natively as well.

MongoDB geospatial query

I use mongo's "$near" query, it works as expected and saves me a lot of time.
Now I need to perform something more complicated. Imagine we have a collection of "checkins" (let's use foursquare notation) that contains the geospatial information (nothing unusual: just lat and lng) and time. Given the checkins by two people, how do I find their "were near to each other" checkins? I mean, e.g.: "1/23/12 you've been 100 meters away"
The easiest solution is to select all the checkins by the first user and find nearest checkin for each first user's checkin on the framework side (I use ruby). But is it the most efficient solution?
Do you have better ideas? Maybe I need some kind of special index?
Best,
Roman
The MongoDB GeoSpatial indexes provide two types of queries: $near and $within. The $near query returns all points in the database that are within a certain range of a requested point, while the $within query lists all points in the database that are inside of a particular area (box, circle, or arbitrary polygon).
MongoDB does not currently provide a query that will return all points that are within a certain distance of any member of another set of points, which is what you seem to want.
You could conceivably use the point data from user1 to build a polygon describing the "area of interest" and then use the $within query to see if there were any checkins by other people inside of that area. If you use a compound index on location & date, you could even restrict the query to folks who were inside of that area on a particular day.
References:
http://docs.mongodb.org/manual/core/indexes/#geospatial-indexes
http://docs.mongodb.org/manual/reference/operators/#geospatial
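For comparison, the client-side approach from the question can be sketched as below. This is Python rather than Ruby, and it makes no MongoDB calls at all: it assumes both users' checkins are already loaded as (lat, lng, time) tuples. The time window is my own addition to the 100 meter example, and all names are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters; assumed constant


def approx_distance_m(lat1, lng1, lat2, lng2):
    """Equirectangular approximation, adequate for distances of a few kilometers."""
    x = math.radians(lng2 - lng1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


def near_pairs(checkins_a, checkins_b, max_m, max_gap):
    """Pairs (a, b) of checkins that are close in both space and time.

    Each checkin is a (lat, lng, time) tuple; max_gap is a timedelta.
    """
    pairs = []
    for a in checkins_a:
        for b in checkins_b:
            close_in_time = abs(a[2] - b[2]) <= max_gap
            if close_in_time and approx_distance_m(a[0], a[1], b[0], b[1]) <= max_m:
                pairs.append((a, b))
    return pairs
```

This compares every checkin of one user with every checkin of the other, so it is only practical when both lists are small or pre-filtered by date.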

ObjectMapper: Find geo places within a certain square, sorted by proximity

I am building a Ruby app on Heroku using Sinatra and a PostgreSQL database interfaced with ObjectMapper. I need to run a query which returns a list of all locations in a database (which each have latitude and longitude attributes) within a certain rectangle (corresponding to the visible map region).
I can do this by searching for latitudes which fall within the map bounds, same for longitude. My question however is, how do I return these results sorted by proximity? I could get all results matching the query and then sort them once they are out of the database, but I want to run this query in batches and return only say the nearest 5 places, then places 6-10, then 11-15, etc.
Can this be done?
EDIT: I have not decided yet whether to use PostgreSQL for sure, I might use MongoDB if it is appropriate.
The immediate question is: proximity to what? You need to define a point to use as the basis for the proximity (the center of the visible map, for example). You can then use PostGIS's st_distance in the ORDER BY clause to sort by distance between the geometry objects. This can be combined with LIMIT and OFFSET to do exactly what you want.
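A sketch of how such a paged query might be assembled is shown below, in Python rather than Ruby. It only builds the SQL text and parameters. The locations(id, name, geom) table, the SRID 4326 geometry column and the %s placeholder style are all assumptions to adapt to your own schema and driver.

```python
def nearest_page_query(bounds, center, page, page_size=5):
    """Build a parameterized PostGIS query for one page of results.

    bounds is (min_lng, min_lat, max_lng, max_lat); center is (lng, lat);
    page is zero-based. Table and column names are hypothetical.
    """
    sql = (
        "SELECT id, name FROM locations "
        # Keep only geometries whose bounding box overlaps the visible map.
        "WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, 4326) "
        # Sort by distance from the chosen reference point.
        "ORDER BY ST_Distance(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326)) "
        "LIMIT %s OFFSET %s"
    )
    params = (*bounds, *center, page_size, page * page_size)
    return sql, params
```

Page 0 gives the nearest five places, page 1 gives places 6-10, and so on. On lat/lng geometries the distance is measured in degrees, which should still be adequate for ordering results inside one map view.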

How to show nearest store based on zipcode

I am creating an app for a business that has several stores around the state.
How can I show the information for the nearest stores based on the zip code?
Thanks for any help
The basic idea is:
Convert the ZIP code to geographical coordinates (longitude and latitude).
Compute the distance of each store to this coordinate.
Order the results by distance, ascending.
Step 2 can be optimized a bit -- for example, you might limit the search to those stores in the same state. You may also want to limit the number of stores returned if you are only going to display 10, for example.
This is about all the detail I can provide since your question is quite general.
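The three steps could be sketched in Python roughly as follows. The lookup tables are plain dicts standing in for whatever ZIP code and store data you actually have, and every name here is hypothetical.

```python
import math


def miles_between(p, q):
    """Haversine distance in miles between two (lat, lng) pairs in degrees."""
    lat1, lng1, lat2, lng2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(min(1.0, math.sqrt(a)))  # 3958.8: assumed Earth radius in miles


def nearest_stores(zip_code, zip_coords, stores, limit=10):
    """Return up to `limit` (store_name, miles) pairs, closest first.

    zip_coords maps ZIP code -> (lat, lng); stores maps name -> (lat, lng).
    """
    origin = zip_coords[zip_code]  # step 1: ZIP code -> coordinates
    ranked = [(name, miles_between(origin, pos))  # step 2: distance to each store
              for name, pos in stores.items()]
    ranked.sort(key=lambda item: item[1])  # step 3: ascending by distance
    return ranked[:limit]
```

With only a few dozen stores, computing every distance like this is cheap enough that no further optimization is likely to be needed.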
