In hill climbing for 1 dimension, I try two neighbors - a small delta to the left and one to the right of my current point - and then keep the one that gives a higher value of the objective function. How do I extend it to an n-dimensional space? How does one define a neighbor in an n-dimensional space? Do I have to try 2^n neighbors (a delta applied to each of the dimensions)?
You don't need to compare each pair of neighbors; you need to compute a set of neighbors, e.g. on a circle (a sphere/hypersphere in higher dimensions) with radius delta, and then take the one with the highest value to "climb up". In any case you will discretize the neighborhood of your current solution and compute the score function for each neighbor. If your function is differentiable, gradient ascent/descent based algorithms may solve your problem:
1) Compute the gradient (direction of steepest ascent)
2) Go a small step into the direction of the gradient
3) Stop if solution does not change
A common problem with these algorithms is that you often only find local maxima/minima. You can find a great overview of gradient descent/ascent algorithms here: http://sebastianruder.com/optimizing-gradient-descent/
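To make the three steps above concrete, here is a minimal gradient-ascent sketch in Python; the objective f, the step size, and the tolerance are illustrative placeholders, not part of any particular library.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    # Estimate the gradient of f at x by central differences.
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

def gradient_ascent(f, x0, step=0.01, tol=1e-8, max_iter=100000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = numerical_gradient(f, x)          # 1) direction of steepest ascent
        x_new = x + step * g                  # 2) small step along the gradient
        if np.linalg.norm(x_new - x) < tol:   # 3) stop if the solution no longer changes
            return x_new
        x = x_new
    return x

# Example: maximize f(x, y) = -(x - 1)^2 - (y + 2)^2, whose maximum is at (1, -2).
f = lambda v: -(v[0] - 1) ** 2 - (v[1] + 2) ** 2
print(gradient_ascent(f, [0.0, 0.0]))
```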
If you are using IEEE-754 floating point numbers, then the obvious answer is something like (2^52*(log_2(delta)+1023))^(n-1)+1 if delta >= 2^(-1022) (more or less, depending on your search space), as that is the only way you can be certain that there are no more neighboring solutions within a distance of delta.
Even if you instead take a random fixed-size sample of all points within a distance of delta, let's say delta = 0.1, you would still have the problem that if the distance from the local optimum were 0.0001, the probability of finding an improvement in just one dimension would be less than 0.0001/0.1/2 = 0.05%, so you would need to take more and more random samples as you get closer to the local optimum (whose value you don't know).
Obviously hill climbing is not intended for the real number space or theoretical graph spaces with infinite degree. You should instead be using a global search algorithm.
One example of a multidimensional search algorithm which needs only O(n) neighbours instead of O(2^n) neighbours is the Torczon simplex method described in Multidirectional search: A direct search algorithm for parallel machines (1989). I chose this over the more widely known Nelder-Mead method because the Torczon simplex method has a convergence proof (convergence to a local optimum given some reasonable conditions).
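For illustration only (this is not the Torczon method itself, just a hedged sketch of the O(n)-neighbour idea): probe a step of +/-delta along each axis, which gives 2n neighbours per iteration rather than 2^n sign combinations, and shrink delta when no neighbour improves.

```python
import numpy as np

def hill_climb(f, x0, delta=0.1, shrink=0.5, min_delta=1e-6):
    x = np.asarray(x0, dtype=float)
    best = f(x)
    while delta > min_delta:
        improved = False
        for i in range(len(x)):            # 2n axis-aligned neighbours
            for sign in (1.0, -1.0):
                y = x.copy()
                y[i] += sign * delta
                fy = f(y)
                if fy > best:              # keep the uphill move
                    x, best, improved = y, fy, True
        if not improved:
            delta *= shrink                # no neighbour is better: refine the step
    return x, best
```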
I successfully implemented an 8-puzzle solver using the A* algorithm and now I am adding a twist to it: there could be more than one empty space in the puzzle and the numbers on the tiles are no longer unique (there could be numbers that are the same).
While the algorithm works after I modified it to generate successor states for all empty spaces, it doesn't solve the puzzle in the smallest number of moves possible (I actually came up with a shorter solution when I tried to solve it by hand - surprise!).
Question: Is Manhattan distance still a viable heuristic in this puzzle? If not, what could the heuristic be?
Yes, an admissible heuristic for this problem can involve Manhattan distance.
The simplest approach is just to take the Manhattan distance to the closest possible target location for each tile.
This is clearly admissible: no tile can reach any target location in fewer moves than the Manhattan distance to its closest possible target, even ignoring all obstacles.
But we can do better - for two identical tiles A and B with target positions 1 and 2, rather than taking the distance to the closest position for each, we can take the minimum over all possible assignments of tiles to positions, so:
min(dist(A,1) + dist(B,2), dist(A,2) + dist(B,1))
This can be generalized to any number of tiles, but keep in mind that for n identical tiles there are n! such assignments, so it gets expensive to calculate very quickly.
Seeing why this is admissible is still fairly easy: since we take the minimum over all assignments of tiles to positions, there is no way the actual shortest distance could be any less.
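A small sketch of this heuristic for one group of identical tiles (the function names and the (row, col) representation are illustrative):

```python
from itertools import permutations

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def group_heuristic(tiles, targets):
    # Minimum total Manhattan distance over all assignments of identical
    # tiles to their possible target positions (n! assignments for n tiles).
    return min(
        sum(manhattan(t, g) for t, g in zip(tiles, perm))
        for perm in permutations(targets)
    )

# Two identical tiles at (0,0) and (2,2) with targets (0,2) and (2,0):
print(group_heuristic([(0, 0), (2, 2)], [(0, 2), (2, 0)]))  # 4
```

For larger groups, the same minimum can be computed in polynomial time by treating it as an assignment problem (e.g. with the Hungarian algorithm), and the result stays admissible.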
I'm looking for an efficient solution to the following problem: for a given set of points in n-dimensional Euclidean space, find the member of this set that minimizes the total distance to the other points in the set.
The obvious naïve approach is quadratic, so I'm looking for something less than quadratic.
My first thought was that all I need is to find the center of the bounding sphere and then find the point in the set closest to that center. But this is actually not true; imagine a right triangle - all its vertices are equidistant from that center, yet exactly one vertex meets our requirement.
It would be nice if someone could shed some light on this issue.
What minimizes the distance to all of the points is their average. Just a guess, but after you find the average you could find the point in the set closest to it. As correctly pointed out in the comments, the median rather than the average will actually minimize the distance (the average minimizes the squared distance). The median can also be calculated in O(n). For high-dimensional datasets this solution would be O(n*m) of course, where m is the number of dimensions.
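A minimal sketch of that recipe with NumPy (keep in mind the coordinate-wise median is only an approximation to the true geometric median):

```python
import numpy as np

def approx_best_point(points):
    # points: (n, m) array of n points in m dimensions.
    med = np.median(points, axis=0)                # coordinate-wise median
    dists = np.linalg.norm(points - med, axis=1)   # distance of each point to it
    return points[np.argmin(dists)]                # closest member of the set

pts = np.array([[0.0, 0.0], [0.0, 3.0], [4.0, 0.0]])  # the right-triangle example
print(approx_best_point(pts))  # picks (0, 0), the vertex with the smallest total distance
```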
Also some links:
See accepted answer here: Algorithm to find point of minimum total distance from locations
And link provided by mcdowella: http://en.wikipedia.org/wiki/Geometric_median
I am making this up as I go along, but there appears to be a close connection between "best point of a set" and "best point" in convex optimization.
Your score function is a sum of distances. Each distance is convex and U-shaped (OK, V-shaped in this case), so their sum is convex and U-shaped too. In particular it has a perfectly good derivative everywhere except at the points of the set themselves, and this derivative is optimistic: if you take the value and the derivative at a point (ignoring the contribution of the point you are standing on), then predictions based on them will be optimistic - the line formed from the derivative lies almost entirely beneath the correct answer, grazing it at a single point.
This leads to the following algorithm:
Repeat:
1) Pick a point at random and check whether it is the best point so far; if so, take note of it.
2) Take the derivative of the sum of distances at this point. Use it, together with the value at that point, to predict the sum of distances at every other point, and discard as candidate answers the points where this prediction is already worse than the best answer so far (you still need to take them into account when working out distances and derivatives). These are the points on the far side of the plane drawn through the chosen point normal to the derivative.
3) Discard the chosen point as a contender as well, and repeat while any candidates remain.
I would expect this to cost something like n log n on randomly chosen points. However, if the N points form the vertices of a regular polygon in n dimensions, it will cost N^2, discarding only the chosen point each time - any of the N points is in fact a correct answer, and they all have the same sum of distances from each other.
I will of course up-vote anybody who can confirm or deny this general principle for finding the best of a set of given points under a convex objective function.
OK - I was interested enough in this to program it up, so I have 200+ lines of Java to dump here if anybody cares. In 2 dimensions it's very fast, but at 20 dimensions you gain only a factor of two or so. This is reasonably understandable: each iteration cuts off points by projecting the problem down to a line and chopping off the fraction of points beyond a cutoff on that line. A randomly chosen point will be about half as far from the centre as the other points, and very roughly you can expect the projection to cut off all but some multiple of the d-th root of 1/2, so as d increases, the fraction of points you can discard in each iteration shrinks.
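In place of the Java, here is a hedged Python sketch of the pruning idea. It leans on convexity: f(q) >= f(p) + grad_f(p) . (q - p), so any candidate whose optimistic linear prediction already exceeds the best sum found so far can be discarded.

```python
import numpy as np

def best_point(points, seed=0):
    pts = np.asarray(points, dtype=float)
    candidates = list(range(len(pts)))
    best_idx, best_val = None, np.inf
    rng = np.random.default_rng(seed)
    while candidates:
        i = candidates.pop(rng.integers(len(candidates)))   # random contender
        diffs = pts - pts[i]                                # vectors from pts[i] to all points
        dists = np.linalg.norm(diffs, axis=1)
        val = dists.sum()                                   # sum of distances at pts[i]
        if val < best_val:
            best_idx, best_val = i, val
        nz = dists > 0                                      # skip the point itself
        grad = -(diffs[nz] / dists[nz, None]).sum(axis=0)   # derivative of the sum at pts[i]
        # Keep only candidates whose linear prediction could still beat the best.
        candidates = [j for j in candidates
                      if val + grad @ (pts[j] - pts[i]) < best_val]
    return pts[best_idx], best_val

pts = np.random.default_rng(1).normal(size=(200, 2))
print(best_point(pts)[1])
```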
You have a map of square tiles where you can move in any of the 8 directions. Given a function cost(tile1, tile2) that tells you the cost of moving from one adjacent tile to another, how do you find a heuristic function h(y, goal) that is both admissible and consistent? Can a method for finding the heuristic be generalized for this setting, or does it vary depending on the cost function?
Amit's tutorial is one of the best I've seen on A* (Amit's page). You should find some very useful hints about heuristics on that page.
Here is the quote relevant to your problem:
On a square grid that allows 8 directions of movement, use Diagonal distance (L∞).
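A sketch of that diagonal-distance heuristic, assuming a straight-move cost D and a diagonal-move cost D2 (the formula is the one given on Amit's page; with D = D2 it reduces to the Chebyshev/L∞ distance):

```python
import math

def diagonal_distance(a, b, D=1.0, D2=math.sqrt(2)):
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    # min(dx, dy) diagonal steps, and the remaining |dx - dy| straight steps.
    return D * (dx + dy) + (D2 - 2 * D) * min(dx, dy)

print(diagonal_distance((0, 0), (3, 5), D=1, D2=1))  # 5 = max(3, 5), the L-infinity case
```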
It depends on the cost function.
There are a couple of common heuristics, such as Euclidean distance (the straight-line distance between two tiles on a 2D plane) and Manhattan distance (the sum of the absolute x and y deltas). But these assume that the actual cost is never less than a certain amount. Manhattan distance is ruled out if your agent can move diagonally at low cost (i.e. the cost of a diagonal move is less than 2). Euclidean distance is ruled out if the cost of moving to a neighbouring tile can be less than the straight-line length of that move (e.g. if the adjacent tile is "downhill" from this one).
Edit
Regardless of your cost function, you always have an admissible and consistent heuristic in h(t1, t2) = -∞. It's just not a good one.
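For reference, hedged one-liners for the two heuristics discussed above, with tiles given as (x, y) pairs:

```python
def manhattan(a, b):
    # Admissible when every single-tile move costs at least 1 and there are no diagonals.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def euclidean(a, b):
    # Admissible when every move costs at least its straight-line length.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
```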
Yes, the heuristic depends on the cost function, in a couple of ways. First, it must be in the same units. Second, the heuristic must never exceed the cost of the actual cheapest path through the nodes.
In the real world, used for things like navigation on a road network, your heuristic might be "the time a car would take on a direct path at 1.5x the speed limit." The cost for each road segment would use the actual speed limit, which will give a higher cost.
So, what is your cost function between tiles? Is it based on physical properties, or defined outside of your graph?
I have a set of N objects, and I'd like to compute an NxN distance matrix. Sometimes my set of N objects is very large, and I'd like to compute an approximation to the NxN distance matrix by computing only a subset of the distance comparisons.
Can anyone point me in the direction of something that calculates approximations to a full distance matrix? I have some ideas in mind, but I'd like to avoid re-inventing the wheel.
Edit: An example of the type of algorithm I have in mind would take advantage of the fact that if there is a very small distance between object A and object B, and a very small distance between object B and object C, then by the triangle inequality the distance between A and C must also be fairly small.
I had this same question and ended up writing Python code for it:
https://github.com/jpeterbaker/lazyDistance
README.md explains how the triangle inequality can be used to update upper and lower bounds for each distance.
Just run the Python file as a script for an example in 2-dimensional space. The plotted lines are the only distances that were actually calculated.
In my version, the time savings aren't about having a large number of objects. As I've written it, it's an O(n^4) algorithm, so it's actually worse than just calculating all distances if the number of objects is large. But it will save time when you have a modest number of objects and the distance function is very expensive to calculate: it assumes that several O(n^2) bookkeeping passes are faster than a single distance measurement.
If n is large, you could look for cheaper methods of deciding which distance to calculate next (ones that don't involve arithmetic over the n^2 entries of the distance-bound matrices). You also may not need to update all 2*n^2 bounds every time, as this code does.
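For illustration (a simplified sketch, not the linked lazyDistance code), here is how one measured distance can tighten the upper- and lower-bound matrices via the triangle inequality:

```python
import numpy as np

def update_bounds(upper, lower, i, j, d):
    # After measuring d = dist(i, j), tighten bounds on every pair (a, b).
    # upper/lower are symmetric n x n matrices of distance bounds.
    upper[i, j] = upper[j, i] = d
    lower[i, j] = lower[j, i] = d
    n = len(upper)
    for a in range(n):
        for b in range(n):
            # d(a,b) <= d(a,i) + d(i,j) + d(j,b)   (and the i/j-swapped chain)
            upper[a, b] = min(upper[a, b],
                              upper[a, i] + d + upper[j, b],
                              upper[a, j] + d + upper[i, b])
            # d(a,b) >= d(a,i) - d(i,b)   (valid for either endpoint)
            lower[a, b] = max(lower[a, b],
                              lower[a, i] - upper[i, b],
                              lower[a, j] - upper[j, b])

n = 5
upper = np.full((n, n), np.inf); np.fill_diagonal(upper, 0.0)
lower = np.zeros((n, n))
update_bounds(upper, lower, 0, 1, 2.5)   # record one measured distance
```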
Honestly, I think it depends on how close you want your approximation to be and how big your subset is. If you just want some overall feel for what the matrix will look like, you can do simple linear interpolation on a random subset (including the maximal and minimal nodes) and get pretty accurate (tm) results.
I think the real trick here is figuring out the interpolation heuristic (linear, quadratic, etc.) and the subset size. You could also compute the distance matrices of various subsets and then interpolate between those matrices with some method (linear, spherical linear, cubic).
Depending on your initial sample, it's pretty much heuristic trial and error until you go "oh, that's good enough for what I need".
Are your "objects" on a network? If the objects are in a network, you can use an all-pairs shortest-paths algorithm to obtain every distance. If not, you're pretty much stuck with calculating all the n x n distances, I think.
The solution you require is similar to what we commonly see in graphs: you can use an all-pairs shortest-path algorithm to find the distances. You can also look at Johnson's algorithm.
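A sketch with SciPy's all-pairs shortest-paths routine (method='J' selects Johnson's algorithm; the small weighted adjacency matrix here is made up):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

adj = np.array([[0, 2, 0],
                [2, 0, 3],
                [0, 3, 0]], dtype=float)   # dense input: 0 means "no direct edge"
dist = shortest_path(adj, method='J', directed=False)
print(dist)   # full matrix of network distances, e.g. dist[0, 2] == 5
```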
Problem: I have two overlapping 2D shapes, A and B, each shape having the same number of pixels, but differing in shape. Some portion of the shapes are overlapping, and there are some pieces of each that are not overlapping. My goal is to move all the non-overlapping pixels in shape A to the non-overlapping pixels in shape B. Since the number of pixels in each shape is the same, I should be able to find a 1-to-1 mapping of pixels. The restriction is that I want to find the mapping that minimizes the total distance traveled by all the pixels that moved.
Brute Force: The brute force approach to this problem is obviously out of the question, since I would have to compute the total distance for every possible mapping, of which there are n! (where n is the number of non-overlapping pixels in one shape), each requiring n distance calculations, giving a total of O(n * n!) or something similar.
Backtracking: The only "better" solution I could think of was backtracking: keep track of the best total so far, and whenever the partial total of the mapping being evaluated reaches or exceeds that minimum, abandon it and move on to the next mapping. Even this won't do any better than O(n!) in the worst case.
Is there any way to solve this problem with a reasonable complexity?
Also note that the "obvious" approach of simply mapping each point to its closest matching neighbour does not always yield the optimal solution.
Simpler Approach?: As a secondary question, if a feasible solution doesn't exist, one possibility might be to partition each non-overlapping section into small regions and map those regions instead, greatly reducing the number of mappings. To calculate the distance between two regions I would use their centers of mass (the average of the pixel locations in each region). However, this raises the problem of how to do the partitioning to get a near-optimal answer.
Any ideas are appreciated!!
This is the minimum-cost matching problem, and you are correct that it is a hard problem in general. However, for the 2D Euclidean bipartite minimum-matching case it is solvable in close to O(n^2) time (see link).
For fast approximations, FryGuy is on the right track with Simulated Annealing. This is one approach.
Also take a look at Approximation algorithms for bipartite and non-bipartite matching in the plane for a O((n/ε)^1.5*log^5(n)) (1+ε)-randomized approximation scheme.
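If an exact answer at O(n^3) is acceptable in practice, SciPy's Hungarian-style solver handles the bipartite case directly (a sketch, not the near-O(n^2) geometric algorithm from the paper; the pixel coordinates are made up):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

a = np.array([[0, 0], [0, 1], [5, 5]])    # non-overlapping pixels of shape A
b = np.array([[1, 0], [6, 5], [0, 2]])    # non-overlapping pixels of shape B

cost = cdist(a, b)                         # pairwise Euclidean distances
rows, cols = linear_sum_assignment(cost)   # optimal 1-to-1 mapping
print(list(zip(rows, cols)), cost[rows, cols].sum())
```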
You might consider simulated annealing for this. Start off by assigning A[x] -> B[y] for each pixel randomly, and calculate the sum of squared distances. Then randomly swap a pair of x <-> y mappings, and accept the swap with probability Q, where Q is higher if the new mapping is better and tends towards zero over time. See the Wikipedia article for a better explanation.
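A minimal sketch of that annealing loop (using plain rather than squared distances, since the problem asks for total distance; all names and parameters are illustrative):

```python
import math, random

def anneal(points_a, points_b, steps=100000, t0=1.0, cooling=0.9999):
    n = len(points_a)
    perm = list(range(n))                  # A[i] -> B[perm[i]]
    random.shuffle(perm)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    cost = sum(dist(points_a[i], points_b[perm[i]]) for i in range(n))
    t = t0
    for _ in range(steps):
        i, j = random.sample(range(n), 2)  # propose swapping two mappings
        delta = (dist(points_a[i], points_b[perm[j]])
                 + dist(points_a[j], points_b[perm[i]])
                 - dist(points_a[i], points_b[perm[i]])
                 - dist(points_a[j], points_b[perm[j]]))
        if delta < 0 or random.random() < math.exp(-delta / t):
            perm[i], perm[j] = perm[j], perm[i]   # accept (always if better)
            cost += delta
        t *= cooling                       # temperature tends towards zero
    return perm, cost
```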
Sort the pixels in shape A in increasing order of x and then y coordinates.
Sort the pixels in shape B in decreasing order of x and then increasing y.
Map pixels at the same index: the first pixel in the sorted list for A maps to the first pixel in B, and so on. Is this not the mapping you are looking for?