Probability based on euclidean distance - probability

First of all sorry for the ambiguous title.
I am working on a genetic algorithm for Multiple Depot Vehicle Routing Problem.
I am creating candidate solutions based on the distances from customers to depots. I have created a method that computes, for each customer, the probability of being served by each depot. Pseudocode below:
for each customer
    for each depot
        calculate euclidean distance between customer and depot
    get the maximum distance
    for each depot
        totalDistance = totalDistance + (maximumDistance - currentDepotDistance)
    for each depot
        depotProbability = (maximumDistance - currentDepotDistance) / totalDistance
The results are the following:
While this formula works, I would like to be able to increase or decrease the probabilities in order to find an appropriate ratio. I would like to be able to move smoothly from the point where the closest depot is always chosen to the point where depots are assigned at random.
EDIT
Results after implementing the algorithm in the accepted answer:
T=0.1 Closest depot to every customer
T=20 Other routes taken into account

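You may want to try something like softmax action selection. Applied to distances (with a minus sign, so that closer depots score higher), the probability of choosing depot d would be:
P(d) = exp(-dist(d) / τ) / Σ_d' exp(-dist(d') / τ)
Where d is each depot, dist(d) is its distance to the customer, and τ is the "temperature" parameter. When τ → 0, your selection turns into greedy selection (always the smallest distance). When τ → ∞, your selection becomes uniformly random.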
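A minimal sketch of softmax selection over depots in Python. It assumes the score fed to the softmax is the negated customer-to-depot distance, which is what makes small τ favour the closest depot; the answer itself only names the technique. The function name and the shift by the minimum distance (for numerical stability) are illustrative choices.

```python
import math


def softmax_depot_probabilities(distances, tau):
    """Return one selection probability per depot for a single customer.

    distances: distances from the customer to each depot.
    tau: temperature; small values favour the closest depot,
         large values approach a uniform choice.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Shift by the smallest distance so the largest exponent is 0.
    # This avoids overflow and guarantees the sum is at least 1.
    d_min = min(distances)
    weights = [math.exp(-(d - d_min) / tau) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `softmax_depot_probabilities([1.0, 2.0, 4.0], 0.01)` puts almost all the probability on the first depot, while a very large `tau` gives each depot close to 1/3.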

Related

Group locations on map based on distance

I've been given the following problem. A set of locations (e.g. around 200 soccer clubs) is spread over a map. I want to group the locations based on their distance to each other. The result should be a list of groups (around 10 to 20) so that the distance each soccer club has to drive to visit all other clubs within their group is minimized.
I'm pretty sure an algorithm exists already. I probably only need the "official" name of this problem.
Can anyone please help me?
You're probably looking for Data Clustering Algorithms. Since you have an idea of the number of clusters, a simple algorithm is k-means clustering.
If you want to choose the maximum distance d at the outset (and then determine how few groups suffice to guarantee that no team needs to drive more than this distance to get to another team in their own group) then you can formulate the problem as a graph colouring problem: make a vertex for each team, and put an edge between two vertices whenever the distance between them exceeds d. The solution to a graph colouring problem assigns a "colour" (just a label) to each vertex so that (a) no two vertices connected by an edge are assigned the same colour and (b) the number of distinct colours used is minimal. (In other words, edges represent "conflicts", indicating that the two endpoints cannot belong to the same group.) So here, each colour corresponds to a group, which is guaranteed to consist only of teams that are all <= d from each other, and the solution will try to minimise the total number of groups. You might need to rerun with a few different values of d until you get a solution with acceptably few groups.
Note that this is an NP-hard problem, so it might take a long time to find an exact (minimal-group-count) solution. There are many heuristics that are much quicker and still do a decent job, though.
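One of the quick heuristics is greedy (first-fit) colouring: visit the teams in some order and put each one into the first group where it is within d of every existing member. The sketch below assumes 2D points and straight-line distance; the function name is hypothetical, and the result is a valid grouping but not necessarily one with the fewest groups.

```python
import math


def greedy_groups(points, d):
    """Assign each point a group index so that all members of a group
    are pairwise at most d apart (first-fit colouring of the conflict graph)."""
    groups = []                        # each group is a list of point indices
    assignment = [None] * len(points)
    for i, p in enumerate(points):
        for g, members in enumerate(groups):
            # A "conflict" is a pair further apart than d.
            if all(math.dist(p, points[j]) <= d for j in members):
                members.append(i)
                assignment[i] = g
                break
        else:
            # No existing group can take this point, so open a new one.
            groups.append([i])
            assignment[i] = len(groups) - 1
    return assignment
```

The number of groups produced depends on the visiting order, so trying a few orderings (or a few values of d) and keeping the best result is a cheap improvement.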

How can I better place cities around procedurally generated terrain?

I'm making a terrain/map generator. The terrain generation works fine, but a problem arises when placing cities around the terrain. I place cities on given terrain depending on the terrain's moisture, temperature, soil quality, and underground mineral prevalence. Naturally, the optimal terrain will have the next-to-optimal terrain very nearby, so my cities appear in clusters, like so (cities are red dots):
The solution I'm leaning towards is identifying if two cities are within a certain defined distance of each other, and if they are, move one of the cities to a location clear of other cities. My questions are:
Is that the best approach for spreading out cities?
If so, what technique can I use to determine how close two cities are (I read about nearest neighbor searches and it looked very promising, maybe that)?
I would loosen the constraints on possible city locations. Calculate, for each position, a probability of it hosting a city.
Then place a city at random according to the probabilities of all locations not yet occupied by cities.
Then lower the probabilities around the newly placed city within a certain radius: near the city lower them a lot (or even set them to 0), and lower them less with increasing distance.
This way the cities should be spread around your terrain quite naturally.
Example (a very small one) of how to pick the next location:
You have 3 possible locations with probabilities 0.9, 0.2 and 0.4.
Now sum up the probabilities: 0.9 + 0.2 + 0.4 = 1.5
Now take a random number between 0 and 1.5:
0 .. 0.9 take the first location
0.9 .. 1.1 take the second location
1.1 .. 1.5 take the third location
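The selection step in this example is often called roulette-wheel selection. A small Python sketch, assuming the weights are non-negative and passed as a list (the function name and the `rng` parameter are illustrative, added so the random source can be swapped out):

```python
import random


def pick_location(weights, rng=random):
    """Return an index chosen with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    r = rng.uniform(0, total)          # random number between 0 and the sum
    cumulative = 0.0
    for index, w in enumerate(weights):
        cumulative += w
        if r < cumulative:             # r falls inside this location's slice
            return index
    # Rounding can leave r equal to the total; use the last positive weight.
    return max(i for i, w in enumerate(weights) if w > 0)
```

With weights `[0.9, 0.2, 0.4]`, a draw of 0.5 selects the first location, 1.0 the second and 1.3 the third, matching the ranges listed above. After each pick, the weights near the new city would be reduced before drawing again.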
Two approaches:
If you have only a few cities, whenever you are considering adding a new one, just calculate the distance from the new site to the existing cities, and do not add the city if the distance is lower than a threshold you set.
If you have lots of cities, allocate a mask of the same size as the original map grid, and whenever you add a city, mark all points near that city as "occupied", meaning that you can't add a new city to those. Basically, you set the suitability of a terrain point near an existing city to zero by marking that in the mask. Create a disk-shaped mask for best results. If you want, you can instead use a flood fill starting from the city to fill the disk mask only along land routes, so that the influence of a city stops at one side of a river, say. (Think of Buda and Pest.)
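The first approach can be sketched in a few lines of Python. This assumes candidate sites come with a suitability score and are tried best-first, which is an added assumption (the answer does not say in what order sites are considered); the names are hypothetical.

```python
import math


def spread_cities(candidates, min_distance, max_cities):
    """Pick up to max_cities sites, skipping any site closer than
    min_distance to a city that was already accepted.

    candidates: list of ((x, y), suitability) pairs.
    """
    cities = []
    # Assumption: try the most suitable sites first.
    for site, _score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(math.dist(site, other) >= min_distance for other in cities):
            cities.append(site)
            if len(cities) == max_cities:
                break
    return cities
```

Each candidate is compared against every accepted city, so this is fine for a few cities; for many cities the mask approach (or a spatial index) avoids the repeated distance checks.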

Marauders dilemma algorithm

I'm making this repost after the earlier one here with more details.
PROBLEM :
The problem consists of a marauder who has to travel to different cities spread over a map. The starting location is known. Each city has a fixed loot associated with it. The marauder travels across terrain of varying nature, by which I mean that the cost of travel differs between each pair of cities. He has to maximize the booty gained.
What we have done:
We have generated an adjacency matrix (booty-path cost in place for each node) and then employed a heuristic analysis. It gave output which is reasonable.
Now the problem is that each city has some vehicles in it, which can be bought and then used to travel. What a vehicle actually does is reduce the path cost. Once a vehicle is bought, it remains in use until the next vehicle is bought. It is up to us to decide whether to buy a vehicle or not, and how.
I need help at this point. How do we integrate the idea of vehicles into what we already have? Also, any further ideas which may help us maximize the profit are welcome. I can post the code if required. Thanks!
One way to do it would be to have a directed edge bearing the cost of the vehicle towards a duplicate graph with the reduced costs. You can even make it so that the reduction is finer than just a percentage if you want to.
The downside is that this will probably increase the size of the graph a lot (as many copies as you have different vehicles, plus the links between them), and if your heuristic is not optimal, you may have to modify it so that it considers the new edge positively.
It sounds as though beam search would suit this problem. Beam search uses a heuristic function H and a parameter k and works like this:
1. Initialize the set S to contain only the initial game position.
2. Set T to the empty set.
3. For each game position in S, generate all possible successor positions after one move by the marauder. (A move being to loot, to purchase a vehicle, to move to an adjacent city, or whatever else a marauder can do.) Add each such successor position to the set T.
4. For each position p in T, evaluate H(p) for a heuristic function H. (The heuristic function can take into account the amount of loot, the possession of a vehicle, the number of remaining unlooted cities, and whatever else you think is relevant and easy to compute.)
5. If you've run out of search time, return the best-scoring position in T.
6. Otherwise, set S to the best-scoring k positions in T and go back to step 2.
The algorithm works well if you store T in the form of a heap with k elements.
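A generic sketch of this beam search in Python. Two deviations from the description are assumptions made for the sake of a short, testable example: the time limit is replaced by a fixed number of steps (`max_steps`), and the best state seen so far is remembered across iterations. The `successors` and `score` callables stand in for the game-specific move generator and the heuristic H; all names here are hypothetical.

```python
import heapq


def beam_search(start, successors, score, k, max_steps):
    """Keep only the k best-scoring states at each step.

    successors(state) -> iterable of states reachable in one move.
    score(state)      -> heuristic value H, higher is better.
    """
    beam = [start]                     # the set S
    best = start
    for _ in range(max_steps):
        # The set T: every successor of every state in the beam.
        candidates = [nxt for state in beam for nxt in successors(state)]
        if not candidates:
            break                      # no moves left anywhere
        # nlargest returns the k best candidates, best first.
        beam = heapq.nlargest(k, candidates, key=score)
        if score(beam[0]) > score(best):
            best = beam[0]
    return best
```

Because states outside the best k are discarded, the result is not guaranteed to be optimal; a larger k trades running time for a better chance of keeping the path that pays off later.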

Understanding Cormen post office location solution

The book "Introduction to Algorithms" by Cormen has the post office location problem in chapter 9.
We are given n points p1, p2, ..., pn with weights w1, w2, ..., wn. Find a point p (not necessarily one of the input points) that minimizes the sum of wi*d(pi, p), where d(a, b) is the distance between points a and b.
Looking at the solution, I understand that the weighted median is the best solution for this problem.
But I have some fundamental doubts about the actual coding part and the usage.
If all elements have equal weight, then to find the weighted median we find the point up to which the sum of the weights is < 1/2. How do we extend that here?
Given a real scenario, say with the number of letters to be delivered to each house as the weights, where we want to minimize the distance traveled by choosing the location of the post office, and given the x coordinates (assuming all houses lie along a single dimension), how would we actually go about it?
Could someone help me clear up my doubts and understand the problem?
EDIT :
I was also thinking about a very similar problem: there is a rectangular 2D grid with different numbers of people at various places, and they all want to meet at one point (which must have integer coordinates). How would this differ from the above problem, and how would we solve it?
You still want the point at which the weights sum to 1/2. Pick any point and consider whether you would do better moving one point to the left or one point to the right from there. If you move left one point you reduce the distance to all points on the left by one and increase the distance to all points on the right by one. Whether you win or lose by this depends on the sum of the weights to the left and the sum of the weights to the right. If the weights do not sum to 1/2 you can do better by moving in the direction that has weight > 1/2, so the only point where you can't do better by choosing another one is the point with weight 1/2 - or, to be more accurate, the point where the weights on either side are both <= 1/2.
For 1/2 to be the right answer the weights have to sum to 1, so if you start off with weights which are numbers of letters then you have to divide them by the total number of letters to get them to sum to one. Of course this penalty function doesn't really make sense unless you have to make a separate trip for each letter to be delivered, but I'm assuming that we are supposed to ignore that.
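A short Python sketch of the one-dimensional case. Instead of normalising the weights to sum to 1, it compares the running sum against half of the total, which is equivalent. The function name is hypothetical, and when several positions are optimal it returns the leftmost one.

```python
def weighted_median(positions, weights):
    """Return a position x that minimises sum(w_i * |x_i - x|) in 1D."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    running = 0
    # Walk the houses from left to right, accumulating their weights.
    for x, w in sorted(zip(positions, weights)):
        running += w
        # First position where at least half of the weight is at or left of x:
        # strictly left of x there is < 1/2, and right of x there is <= 1/2.
        if 2 * running >= total:
            return x
```

For houses at x = 1, 2 and 10 receiving 1, 1 and 5 letters, the result is 10: the weighted cost there is 9 + 8 = 17, whereas at x = 2 it would be 1 + 40 = 41.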
EDIT
For more than one dimension, you pretty much end up solving the problem of minimising the weighted sum of distances directly. Wikipedia describes this in http://en.wikipedia.org/wiki/Geometric_median. You want to take weights into account, but that doesn't complicate the problem that much. One way of doing it is http://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares. Unfortunately, that doesn't guarantee that the solution it finds will be on a grid point, or that the nearest grid point to a solution will be the best grid point. It probably won't be too bad, but finding the very best grid point in all possible cases might be trickier.
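Weiszfeld's algorithm is a form of iteratively reweighted least squares for the geometric median, and it extends to weights directly. The Python sketch below assumes 2D points and Euclidean distance; the function name, the starting point (the weighted centroid), the iteration cap and the `eps` clamp are illustrative choices. It returns a continuous point and makes no attempt to find the best grid point.

```python
import math


def weighted_geometric_median(points, weights=None, iterations=200, eps=1e-9):
    """Approximate the point minimising sum(w_i * distance(p_i, p)) in 2D."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    # Start from the weighted centroid.
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            # Clamp tiny distances so we never divide by zero when the
            # estimate sits on an input point (a common simplification).
            dist = max(math.hypot(px - x, py - y), eps)
            num_x += w * px / dist
            num_y += w * py / dist
            denom += w / dist
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break                      # converged
    return (x, y)
```

Convergence is only linear when the optimum coincides with an input point, so the iteration cap and tolerance may need adjusting for real data.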
EDIT : this answer is wrong, see comments
What you're looking for is called the center of mass (IMHO the weighted median is the center of mass in one dimension).
I didn't understand your first question; can you give more detail?
For your example, we would compute the average position, weighted by the number of letters associated with each position. This would give us: x_center = sum(x_i * w_i) / sum(w_i) and y_center = sum(y_i * w_i) / sum(w_i).
Did that answer your question?

Algorithm to find point of minimum total distance from locations

I'm building an application based around finding a "convenient meeting point" given a set of locations.
Currently I'm defining "convenient" as "minimising the total travel distance". This is a different problem from finding the centroid as illustrated by the following example (using cartesian coordinates rather than latitude and longitude for convenience):
A is at (0,0)
B is at (0,0)
C is at (0,12)
The location of minimum total travel for these points is at (0,0) with total travel distance of 12; the centroid is at (0,4) with total travel distance of 16 (4 + 4 + 8).
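The two totals can be checked with a few lines of Python (the helper name is hypothetical; `math.dist` needs Python 3.8 or later):

```python
import math


def total_distance(point, locations):
    """Sum of straight-line distances from point to every location."""
    return sum(math.dist(point, loc) for loc in locations)


locations = [(0, 0), (0, 0), (0, 12)]
centroid = (
    sum(x for x, _ in locations) / len(locations),
    sum(y for _, y in locations) / len(locations),
)
```

Here `centroid` evaluates to (0, 4), `total_distance(centroid, locations)` to 16 and `total_distance((0, 0), locations)` to 12.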
If the location were confined to being at one of the points, the problem appears to become simpler, but this isn't a constraint I intend to have (unlike, for example, this otherwise similar question).
What I can't seem to do is come up with any sort of algorithm to solve this - suggestions welcomed please!
Here is a solution that finds the geographical midpoint and then iteratively explores nearby positions to adjust that towards the minimum total distance point.
http://www.geomidpoint.com/calculation.html
This question is also quite similar to
Minimum Sum of All Travel Times
Here is a wikipedia article on the general problem you're trying to solve:
http://en.wikipedia.org/wiki/Geometric_median
In a way what you appear to be looking for is the center of mass of a triangle with equal weights at the vertices. That would point to barycentric coordinates.
When going beyond a triangle there are solutions for generalized barycentric coordinates and you could give priorities to persons by modifying the weight of the vertices. What that still would not account for is distances on a real map (can't just travel straight in any direction) but it may be a start?
One option is to define an objective (and gradient) function and use a generic optimization library, such as scipy.optimize. fmin_cg would be a good algorithm to try for your problem. Your objective would be the sum of distances as defined in the "Definition" section of the Geometric median Wikipedia page referenced by hatchet. The argument to your objective function is y.
