How are multiple origins/destinations supposed to be used in google distance matrix? - google-distancematrix-api

I'm wondering about origins and destinations in distance matrix being plural. According to the documentation you can use several locations as a starting and/or finishing point.
How can several locations be one point?
Looking at other posts, it seems as if this is a way to find a route between locations? So the API would then choose one of the origin locations as the optimal (shortest distance?) starting point and one of the destinations as the optimal finishing point?

Further down in the documentation it is written that the response contains one row per origin, with an element for each origin–destination pairing.
This is written under the responses section and not under the documentation for origins/destinations, which is the reason this question is relevant.
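For reference, here is a rough sketch of what a request with multiple origins and destinations looks like and how the rows and elements in the response line up (Python with the requests library; the place names and YOUR_API_KEY are placeholders):

```python
import requests

params = {
    "origins": "Stockholm,Sweden|Uppsala,Sweden",        # two origins, pipe-separated
    "destinations": "Gothenburg,Sweden|Malmo,Sweden",    # two destinations
    "key": "YOUR_API_KEY",                               # placeholder
}
resp = requests.get(
    "https://maps.googleapis.com/maps/api/distancematrix/json", params=params
).json()

# rows[i] corresponds to origins[i]; rows[i]["elements"][j] to destinations[j],
# so there is one result per origin-destination pair, not one chosen pair.
for i, row in enumerate(resp["rows"]):
    for j, element in enumerate(row["elements"]):
        if element["status"] == "OK":
            print(resp["origin_addresses"][i], "->",
                  resp["destination_addresses"][j],
                  element["distance"]["text"], element["duration"]["text"])
```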

Related

Grouping web site users by routes they make

I'm writing a simple website. I want to be able to group users by the routes they take on my site. For example, I have this site tree, but the final product will be more complicated. Let's say I have three users.
User one route is A->B->C->D
User two route is A->B->C->E
User three route is J->K
We can say that users one and two belong to the same group and user three belongs to some other group.
My question is: what algorithm (or maybe more than one) should I use to accomplish that? Also, what data do I need to collect?
I have some ideas; however, I want to compare them with someone who might have more experience than me.
I'm looking for suggestions rather than an exact solution to my problem. Also, if there are any ready-made solutions I can read about, I would appreciate that as well.
You could also consider common subpaths. A pointer in this direction, and to web usage analysis in general: http://pdf.aminer.org/000/473/298/improving_the_effectiveness_of_a_web_site_with_web_usage.pdf
As a first cut, it seems reasonable to divide the problem into (1) defining a similarity score for two traces and (2) using the similarity scores to cluster. One possibility for (1) is Levenshtein distance. Two possibilities for (2) are farthest-point clustering and k-modes.
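A rough sketch of both pieces in Python, assuming each user's route is recorded as a sequence of page identifiers (the helper names and the example traces are made up for illustration):

```python
# (1) Levenshtein distance scores how similar two traces are;
# (2) greedy farthest-point clustering picks k representative traces and
#     assigns every trace to its nearest representative.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # skip an element of a
                           cur[j - 1] + 1,           # skip an element of b
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]

def farthest_point_clustering(traces, k):
    """Greedy k-center: repeatedly add the trace farthest from chosen centers."""
    centers = [0]
    while len(centers) < k:
        dists = [min(levenshtein(t, traces[c]) for c in centers) for t in traces]
        centers.append(max(range(len(traces)), key=lambda i: dists[i]))
    # assign each trace to its closest center
    return [min(centers, key=lambda c: levenshtein(t, traces[c])) for t in traces]

traces = [list("ABCD"), list("ABCE"), list("JK")]
print(farthest_point_clustering(traces, k=2))   # users 0 and 1 end up in the same group
```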

Dividing the world in a thousand or so locations

Background: I want to create a weather service, and since most available APIs limit the number of daily calls, I want to divide the planet into a thousand or so areas.
Obviously, internet users are not uniformly distributed, so the sampling should be finer around densely populated regions.
How should I go about implementing this?
Where can I find data regarding geographical internet user density?
The algorithm will probably be something similar to k-means. However, implementing it on a sphere with oceans may be a bit tricky. Any insight?
Finally, maybe there is a way I can avoid doing all of this?
Very similar to k-means is the centroidal Voronoi diagram (it is the continuous version of k-means). However, this would produce a uniform tessellation of your sphere that does not account for user density as you wish.
So a similar solution is the same technique but used with a Power Diagram: a Power Diagram is a Voronoi Diagram that accounts for a density (by assigning a weight to each Voronoi seed). Such a diagram can be computed using an embedding in 3D space (instead of 2D) that consists of the first two (x, y) coordinates plus a third one which is the square root of [any large positive constant minus the weight for the given point].
Using that, you can obtain a tessellation of your domain that accounts for a user density.
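A minimal sketch of the assignment rule behind such a power diagram, with made-up seeds and weights (a full construction would hand the lifted 3D points to a Voronoi library, but the rule alone shows how weights change the cell sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
seeds = rng.uniform(0, 1, size=(5, 2))     # 5 seed locations in the unit square (illustrative)
weights = rng.uniform(0, 0.2, size=5)      # larger weight = denser region = bigger cell

def power_cell(points, seeds, weights):
    """Assign each point to the seed minimizing the power distance ||x - s||^2 - w."""
    d2 = ((points[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2 - weights[None, :], axis=1)

points = rng.uniform(0, 1, size=(1000, 2))
labels = power_cell(points, seeds, weights)
print(np.bincount(labels))                 # cell sizes reflect the weights
```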
You don't care about internet user density in general. You care about the density of users using your service, and you don't care where those users are, you care about the places they ask about. So once your site has been going for more than a day, you can use the locations people asked about the previous day to work out what the areas should be for the next day.
Dynamic programming on a tree is easy. What I would do for an algorithm is build a tree of successively more finely divided cells. More cells mean a smaller error, because people get predictions for points closer to them, and you can work out the error, or at least the relative error, between more cells and fewer cells. Starting from the bottom up, work out the smallest possible total error contributed by each subtree when it is allowed to be divided into up to 1, 2, 3, ..., N cells. You can work out the best possible division and smallest possible error for each k = 1..N at a node by looking at the smallest possible errors you have already calculated for each of its descendants and working out how best to share the available k divisions between them.
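A hedged sketch of that bottom-up calculation, assuming each node of the cell hierarchy knows the error it would incur if left undivided; the Cell class and the example numbers are purely illustrative:

```python
class Cell:
    def __init__(self, error, children=()):
        self.error = error              # error if this cell is left as a single cell
        self.children = list(children)  # finer subdivisions of this cell

def best(node, k):
    """Smallest total error for node's subtree using at most k leaf cells."""
    if k <= 1 or not node.children:
        return node.error
    # Knapsack-style split of the k available cells among the children:
    # combo[j] = best error using exactly j cells across the children seen so far.
    combo = [0.0]
    for child in node.children:
        new = [float("inf")] * (k + 1)
        for used, err in enumerate(combo):
            for extra in range(1, k - used + 1):
                cand = err + best(child, extra)
                if cand < new[used + extra]:
                    new[used + extra] = cand
        combo = new
    # Either subdivide (each child gets at least one cell) or keep the node whole.
    return min(min(combo[1:]), node.error)

# Tiny example: a root cell whose subdivisions reduce the error step by step.
leaves = [Cell(1.0), Cell(1.0)]
root = Cell(10.0, [Cell(4.0, leaves), Cell(3.0)])
for k in (1, 2, 3, 4):
    print(k, best(root, k))
```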
I would try to avoid doing this by thinking of a different idea. Depending on the way you look at life, there are at least two disadvantages of this:
1) You don't seem to be adding anything to the party. It looks like you are interposing yourself between organizations that actually make weather forecasts and their clients. Organizations lose direct contact with their clients, which might for instance lose them advertising revenue. Customers get a poorer weather forecast.
2) Most sites have legal terms of service, which most clients can ignore without worrying. My guess is that you would be breaking those terms of service, and if your service gets popular enough to be noticed they will be enforced against you.

Finding nearest road using Google Places API

In the Ruby application which I'm developing, given latitude and longitude of a point, I need to find the nearest road/highway to the point.
Can someone tell me how I can go about this using Google Places API?
I have tried giving a certain radius and finding the roads using the 'types:"route"' parameter but it gives me "Zero results" as the output.
The Google Places API is not designed to return nearby geographical locations; it is designed to return a list of nearby establishments, plus up to two locality or political type results to help identify the area you are performing a Place Search request for.
The functionality you are requesting would be possible with the Google Geocoding API by passing the lat, lng to the latlng parameter of the HTTP reverse geocoding request:
http://maps.googleapis.com/maps/api/geocode/json?latlng=41.66547000000001,-87.64157&sensor=false
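A small sketch of that reverse-geocoding call and of pulling the road name out of the response (shown in Python for brevity; the same HTTP request works from Ruby, and YOUR_API_KEY is a placeholder). Address components whose types include "route" carry the name of the road the point resolves to:

```python
import requests

resp = requests.get(
    "https://maps.googleapis.com/maps/api/geocode/json",
    params={"latlng": "41.66547000000001,-87.64157", "key": "YOUR_API_KEY"},
).json()

# Scan the reverse-geocoding results for the first component of type "route".
road = next(
    (component["long_name"]
     for result in resp.get("results", [])
     for component in result["address_components"]
     if "route" in component["types"]),
    None,
)
print(road)
```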
Old question, but we have an answer now. You can use the Nearest Roads API by Google. It returns the closest road segment for each point.
I have implemented it in one project and it does work. Keep in mind, though, that the points passed do not need to be part of a continuous path; for an isolated point you may get the next road over (a parallel road), which may not be the one you want.
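A minimal sketch of the nearestRoads call (Python with the requests library; the coordinates and YOUR_API_KEY are placeholders). Each input point comes back as a point snapped to the closest road segment:

```python
import requests

points = "60.170880,24.942795|60.170879,24.942796"   # pipe-separated lat,lng pairs
resp = requests.get(
    "https://roads.googleapis.com/v1/nearestRoads",
    params={"points": points, "key": "YOUR_API_KEY"},
).json()

# Each snapped point references the index of the input point it belongs to
# and the place ID of the road segment it was snapped to.
for snapped in resp.get("snappedPoints", []):
    loc = snapped["location"]
    print(snapped.get("originalIndex"), loc["latitude"], loc["longitude"],
          snapped.get("placeId"))
```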

Identifying common route segments from GPS tracks

Say I've got a bunch of recorded GPS tracks. Some are from repeated trips over the same route, some are from completely unique routes, and some are distinct routes yet have some segments in common.
Given all this data, I want to:
identify the repeated trips over the same route
identify segments which are shared by multiple routes
I suppose 1 is really a special case of 2.
To give a concrete example: suppose you had daily GPS tracks of a large number of bicycle commuters. It would be interesting to extract from this data the most popular bicycle commuting corridors based on actual riding rather than from the cycling maps that are produced by local governments.
Are there published algorithms for doing this? How do they work? Pointers to papers and/or code greatly appreciated.
You can use a 3D histogram to find the most visited points on the map. Using that you can derive the most used paths.
Detail: keep a 2D count matrix and initialize it to zero, X[i,j] = 0. For each track, increment the X[i,j] cells on the path. Once you have processed all the tracks, threshold this matrix at a minimum count (what is the minimum number of tracks for a cell to count as a repeated trip?).
Some practical details: assuming you have the set of points through which a path goes, you can find the cells on the path between two such points with http://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm . You might want to draw a "thicker line" to account for the noisy nature of the data.
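A rough sketch of that counting-grid idea, assuming the tracks have already been projected onto integer grid coordinates; the example tracks and the threshold of 2 are made up:

```python
import numpy as np

def bresenham(x0, y0, x1, y1):
    """Yield the integer grid cells on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx, sy = (1 if x0 < x1 else -1), (1 if y0 < y1 else -1)
    err = dx + dy
    while True:
        yield x0, y0
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy

def count_matrix(tracks, width, height):
    """Count, per grid cell, how many tracks pass through it."""
    counts = np.zeros((height, width), dtype=int)
    for track in tracks:                       # track = list of (x, y) grid cells
        visited = set()
        for (x0, y0), (x1, y1) in zip(track, track[1:]):
            visited.update(bresenham(x0, y0, x1, y1))
        for x, y in visited:                   # count each track at most once per cell
            counts[y, x] += 1
    return counts

tracks = [[(0, 0), (5, 3), (9, 3)], [(0, 1), (5, 3), (9, 3)], [(9, 0), (0, 9)]]
popular = count_matrix(tracks, 10, 10) >= 2    # cells used by at least 2 tracks
print(np.argwhere(popular))
```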
This seems a very general question that pertains to many GIS problems.
After some search, it seems you would need to apply some algorithms for computing the similarity between any two routes.
Wang et al. (2013) analyze a number of such measures. However, all of these measures are implemented by dynamic programming, with time complexity O(N1·N2), where N1 and N2 are the numbers of points in the two routes.
A more efficient solution is provided by Mariescu-Istodor et al. (2017) in which each route is transformed into a set of cells in a predefined grid system. The paper also defines the measure of inclusion as the amount of one route contained inside the other, which seems to relate to the second point in your question.
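A hedged sketch of the cell-based idea: reduce each route to the set of grid cells it passes through, after which similarity and inclusion become cheap set operations. The 0.001-degree cell size and the example routes are arbitrary, and for simplicity only the sampled points are bucketed; a fuller version would also rasterize the segments between them.

```python
def to_cells(route, cell=0.001):
    """Map a route (list of (lat, lng) points) to the set of grid cells it touches."""
    return {(int(lat / cell), int(lng / cell)) for lat, lng in route}

def similarity(a, b):
    """Symmetric overlap of two cell sets (Jaccard index)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def inclusion(a, b):
    """Fraction of route a's cells that are covered by route b."""
    return len(a & b) / len(a) if a else 0.0

route1 = [(60.1700, 24.9400), (60.1705, 24.9410), (60.1710, 24.9420)]
route2 = [(60.1700, 24.9400), (60.1705, 24.9410)]
c1, c2 = to_cells(route1), to_cells(route2)
print(similarity(c1, c2), inclusion(c2, c1))   # route2 is fully contained in route1
```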

Sort POIs by distance from current location

Trover is an awesome app: it shows you a stream of discoveries (POIs) people have uploaded - sorted by the distance from any location you specify (usually your current location). The further you scroll through the feed, the farther away the displayed discoveries are. An indicator tells you quite accurately how far the currently shown discoveries are (see screenshots on Website).
This is different from most other location-based apps, which deliver their results (POIs) based on fixed regions (e.g. give me all pizzerias within a 10 km radius) and can be implemented using a single spatial data structure (or an SQL engine supporting spatial data types). Delivering the results the way Trover does is considerably harder:
You can query POIs for arbitrary locations. Give Trover a location in the far East of Russia and it will deliver discoveries where the first one is 2000km away and continuously increasing from there.
The result list of POIs is not limited by some spatial range. If you scroll long enough through the feed you will probably see discoveries which are on the other side of the globe.
The above points require a semi-strict ordering of their POIs for any location. The fact that you can scroll down and reload more discoveries implies that they can deliver specific sections of the sorted data (e.g. give me the next 20 discoveries that are at least 100km away from my current location).
It's fast; the fetching and the distance indications are instant. The discoveries must be pre-sorted. I don't know how many discoveries they have in their DB, but it must be more than you would want to sort ad hoc.
I find these characteristics quite remarkable and wonder how this is implemented. Any suggestions what kind of data-structure, algorithms or caching might be used?
I don't get the question. What do you want an answer to?
Edit:
They might use a graph database where an edge represents the distance between two nodes. That way you can get the distance via the relationships of nearby POIs. You would calculate the distance and create edges to nearby nodes. To get the distance from an arbitrary point you just do a circle-distance calculation; for another node you just add up the edge values, as they represent the distance (this is for the case of a walking, biking, or car calculation). Adding them up might not give the shortest path, but it will give a relative indication, which it seems like they use.
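This is not necessarily how Trover does it, but a brute-force sketch of the behaviour described in the question (sort every POI by great-circle distance from an arbitrary point, then serve the feed in pages) makes the requirements concrete; a production system would replace the full sort with a spatial index. The POI list and query point below are made up:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def feed(pois, lat, lng, offset=0, page_size=20):
    """Return one page of POIs ordered by distance from (lat, lng)."""
    ranked = sorted(pois, key=lambda p: haversine_km(lat, lng, p["lat"], p["lng"]))
    return ranked[offset:offset + page_size]

pois = [{"name": "A", "lat": 59.33, "lng": 18.06},
        {"name": "B", "lat": 48.86, "lng": 2.35},
        {"name": "C", "lat": 35.68, "lng": 139.69}]
print([p["name"] for p in feed(pois, 60.17, 24.94, offset=0, page_size=2)])
```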
