Issue with Ramer–Douglas–Peucker algorithm while drawing a line

I am developing a painting application for iOS, and to get smooth lines I apply the Ramer–Douglas–Peucker algorithm to the sampled points.
The algorithm works on the whole vector of points, so the result changes as points are added. This causes the resulting curve to "jump" while the user paints.
Is there a known solution to this problem?

I've never implemented or used this algorithm, but I can think of two possible solutions:
Apply the algorithm to discrete sections of the line. That is, wait until the user has drawn 10 points, then run the algorithm on points 0..9. Then wait until the user has drawn the next 10 points and run the algorithm on points 10..19, and so on. One possible caveat is that it could create side-effects at points 10, 20, etc., but I really don't know if it would be noticeable to the user.
Wait until the user is done drawing, then run the algorithm once on the whole line. I've seen this approach used in apps before.
Both of these have the advantage that you're running the algorithm on each point no more than twice (and exactly once in the latter case), whereas if you run the algorithm every time a point is added you end up running it on every previous point every time you add a point, which could have a performance penalty.
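To make the first idea concrete, here is a minimal sketch in Python (not iOS code). The names rdp, ChunkedSimplifier, chunk_size and epsilon are my own, and I measure distance to the segment rather than to the infinite line, which is a common variant. Each chunk starts at the last point of the previous one so the pieces stay connected, which means chunks overlap by one point instead of being exactly 0..9, 10..19.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: keep the endpoints, recurse on the farthest point."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]
    left = rdp(points[:index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point

class ChunkedSimplifier:
    """Simplifies a stroke chunk by chunk; finished chunks never change again."""

    def __init__(self, chunk_size=10, epsilon=1.0):
        self.chunk_size = chunk_size
        self.epsilon = epsilon
        self.frozen = []   # simplified points that will no longer move
        self.pending = []  # raw points of the chunk still being drawn

    def add_point(self, p):
        self.pending.append(p)
        if len(self.pending) >= self.chunk_size:
            simplified = rdp(self.pending, self.epsilon)
            # Skip the first point if it repeats the last frozen point.
            if self.frozen and self.frozen[-1] == simplified[0]:
                simplified = simplified[1:]
            self.frozen.extend(simplified)
            # The next chunk starts at the shared endpoint.
            self.pending = [self.pending[-1]]

    def current_polyline(self):
        """Frozen part plus the raw tail of the unfinished chunk."""
        tail = self.pending
        if self.frozen and tail and self.frozen[-1] == tail[0]:
            tail = tail[1:]
        return self.frozen + tail
```

You would call add_point for every touch sample and redraw current_polyline(). Only the raw tail of the current chunk can still change, so nothing already simplified jumps.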
Like I said, this isn't an area of expertise for me, but I hope it gives you some ideas.

I doubt this can be avoided at all, for a simple reason: the algorithm cannot guess the future points.
Imagine that you draw the first two points; obviously you will keep them. Now move to a third point. If R-D-P tells you to discard the middle point, you may not, because that would cause a jump. And so on. Disallowing jumps implies disallowing any deletion!
Maybe you can lessen the psychological effect by drawing both the raw curve, which remains stable, and the smoothed one.
That said, R-D-P may not be the best approach for smoothing.


How does Particle Swarm Optimization reach a final solution?

I understand that each particle is a solution to a specific function, and each particle and the swarm is constantly searching for the best solution. If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution? If this is the case, what makes PSO better than just iterating through a list?
Your terminology is a bit off. Simple PSO is a search for a vector x that minimizes some scalar objective function E(x). It does this by creating many candidate vectors. Call them x_i. These are the "particles". They are initialized randomly in both position and rate of change, also called velocity, which is consistent with the idea of a moving particle, even though that particle may have many more than 3 dimensions.
Simple rules describe how the position and velocity change over time. The rules are chosen so that each particle x_i tends randomly to move in directions that reduce E(x_i).
The rules usually involve tracking the "single best x_i value seen so far" and are tuned so that all particles tend to head generally toward that best value with random variations. So the particles swarm like buzzing bees, heading as a group toward a common goal, but with many deviations by individual bees that, over time, cause the common goal to change.
It's unfortunate that some of the literature calls this goal or best particle value seen so far "the global minimum." In optimization, global minimum has a different meaning. A global minimum (there can be more than one when there are "ties" for best) is a value of x that - out of the entire domain of possible x values - produces the unique minimum possible value of E(x).
In no way is PSO guaranteed to find a global minimum. In fact, your question is a bit nonsensical in that one generally never knows when a global minimum has been found. How would you? In most problems you don't even know the gradient of E (which gives the direction taking E to smaller values, i.e. downhill). This is why you are using PSO in the first place. If you know the gradient, you can almost certainly use numerical techniques that will find an answer more quickly than PSO. Without a gradient, you can't even be sure you've found a local minimum, let alone a global one.
Rather, the best you can usually do is "guess" when a local minimum has been found. You do this by letting the system run while watching how often and by how much the "best particle seen so far" is being updated. When the changes become infrequent and/or small, you declare victory.
Another way of putting this is that PSO is used on problems where reducing E(x) is always good and "you'll take anything you can get" regardless of whether you have any confidence that what you got is the best possible. E.g. you're Walmart and any way of locating your stores that saves/makes more dollars is interesting.
With all this as background, let's recap your specific questions:
If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution?
There's no answer because there's no way to determine a global best has been found. The swarm of buzzing particles might find a new best in the next iteration or ten trillion iterations from now. You seldom know.
If this is the case what makes PSO better than just iterating through a list?
I don't exactly grok what you mean by this. The PSO is emulating the way swarms of biological entities like bugs and herd animals behave. In this manner it resembles genetic algorithms, simulated annealing, neural networks, and other families of solution finders that use the following logic: Nature, both physical and biological, has known-good optimization processes. Let's take advantage of them and do our best to emulate them in software. We are using nature to do better than any simple iteration we might devise ourselves.
Given a function, a particle swarm attempts to find the solution (a vector) that will minimize (or sometimes maximize, depending on the problem) the value of that function.
If you happen to know the minimum value of the function (suppose, for argument's sake, it is 0) AND you are lucky enough to generate the solution that gives you 0 on the first step, then you can exit the loop and stop the algorithm.
That said, the probability of randomly generating that solution on initialization is vanishingly small.
In practice, when you would want to use PSO, you will most likely not know the minimum value, so you won't be able to use it as a stopping condition.
In particle swarm optimization, the optimization does not come from the random initial step, but from the modification that occurs when each solution is adjusted by a velocity determined by a social and a cognitive component.
The social component consists of the best solution found so far by the whole swarm (the global best).
The cognitive component consists of the best location seen so far by the current particle.
This adjustment will move the particle along a line between the global best and the current best - in the hope that there is a better solution between them.
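As a rough illustration of that adjustment, here is one update step in Python. This is only a sketch of the usual textbook form; the function name pso_step and the default weights w, c1 and c2 are assumptions of mine, not values from any particular library.

```python
import random

def pso_step(positions, velocities, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """Advance every particle by one step, modifying the lists in place."""
    for i, x in enumerate(positions):
        for d in range(len(x)):
            # Cognitive pull: toward the best location this particle has seen.
            cognitive = c1 * rng.random() * (personal_best[i][d] - x[d])
            # Social pull: toward the best location the whole swarm has seen.
            social = c2 * rng.random() * (global_best[d] - x[d])
            velocities[i][d] = w * velocities[i][d] + cognitive + social
            x[d] += velocities[i][d]
```

After each step you would re-evaluate the objective at every position and update personal_best and global_best wherever a particle improved.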
I hope that answers the question in some way
Just to add to the answers above: your problem seems to be linked to the common issue of "when should I stop my PSO?" This is a question everyone faces when launching a swarm since (as clearly explained above) you never know whether you have reached the global best solution (except for very specific objective functions).
Usual tricks already present in most PSO implementations:
1- limit the number of iterations, since there is always a limit on processing time (you could also convert the iteration limit into a time limit by measuring the time spent evaluating the objective).
2- stop the algorithm when the progress in optimization starts to be insignificant.
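Both tricks fit in a small helper that the main loop consults once per iteration. A sketch in Python; the class name StopCriterion and the parameters min_improvement and patience are made up for illustration.

```python
class StopCriterion:
    """Stops on an iteration cap or when progress has stalled for a while."""

    def __init__(self, max_iterations=1000, min_improvement=1e-8, patience=50):
        self.max_iterations = max_iterations
        self.min_improvement = min_improvement
        self.patience = patience      # stalled iterations tolerated
        self.iterations = 0
        self.best = float("inf")
        self.stalled = 0

    def should_stop(self, current_best):
        """Call once per iteration with the best objective value seen so far."""
        self.iterations += 1
        if self.best - current_best > self.min_improvement:
            self.best = current_best  # significant progress: reset the counter
            self.stalled = 0
        else:
            self.stalled += 1
        return (self.iterations >= self.max_iterations
                or self.stalled >= self.patience)
```

What counts as "insignificant" is problem-specific, so min_improvement and patience have to be chosen with the scale of your objective in mind.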

Algorithm(s) for finding moving entities in a maze

I have a maze, a character controlled by the player, and a drone that has to find him (by itself). Does anyone know an (efficient) AI algorithm for doing something like this?
P.S. I know there are several path finding algorithms (e.g. A*), but as far as I know these only work for finding the path between two nodes that "don't move" (this would work if my character was standing still, but that's obviously not the case).
If the "start point" is where the drone is, and the "end point" is to run into the player, about the best you can do using just a "standard" algorithm is to use A* periodically and from that determine where the drone needs to move.
As you get closer to the player, you will be calculating faster and faster since the search space is, in theory, smaller.
Using this, it would be possible for the player to find a set of positions that, when moving between them causes the drone to get "stuck" just moving back and forth, but those sorts of optimizations are situation-specific and a general algorithm won't include them.
Essentially, you do have a fixed search space each "frame", but you just have to run it each frame to decide what to do.
There are likely tweaks to A* that cover minor perturbations between runs, but I don't know any off the top of my head.
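A sketch of the "replan every frame" idea in Python. To keep it short I use a plain breadth-first search where you would normally plug in A*. The grid format (a list of strings with '#' for walls) and the function names are assumptions for the example.

```python
from collections import deque

def next_step(grid, start, goal):
    """First cell on a shortest path from start to goal, or start if no path."""
    if start == goal:
        return start
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back until we reach the cell right after start.
            while came_from[cell] != start:
                cell = came_from[cell]
            return cell
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return start

def chase(grid, drone, player_positions):
    """Replan once per frame against the player's latest position."""
    trail = [drone]
    for player in player_positions:
        drone = next_step(grid, drone, player)
        trail.append(drone)
        if drone == player:
            break
    return trail
```

Only the first step of each plan is ever used, so the rest of the path is thrown away every frame. That is wasteful but simple, and on small mazes it is usually cheap enough.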

Robot exploration algorithm

I'm trying to devise an algorithm for a robot trying to find a flag (positioned at an unknown location) in a world containing obstacles. The robot's mission is to capture the flag and bring it to its home base (which is its starting position). At each step the robot sees only a limited neighbourhood (it does not know in advance how the world looks), but it has unlimited memory to store already visited cells.
I'm looking for any suggestions about how to do this in an efficient manner. Especially the first part; namely getting to the flag.
A simple Breadth First Search/Depth First Search will work, albeit slowly. Be sure to prevent the bot from checking paths that have the same square multiple times, as this will cause these algorithms to run much longer in standard cases, and indefinitely in the case of the flag being unable to be reached.
A* is the more elegant approach, especially if you know the location of the flag relative to yourself. Wikipedia, as per usual, does a decent job of explaining it. The classic heuristic to use is the Manhattan distance (number of moves assuming no obstacles) to the destination.
These algorithms are useful for the return trip - not so much the "finding the flag" part.
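For the return trip, a bare-bones A* with the Manhattan heuristic could look like this in Python. It is a sketch that assumes a 4-connected grid given as a list of strings with '#' marking obstacles; that representation is my own choice.

```python
import heapq

def manhattan(a, b):
    """Number of moves between a and b if there were no obstacles."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar(grid, start, goal):
    """Shortest path as a list of (row, col) cells, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    open_heap = [(manhattan(start, goal), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = cell
                priority = new_cost + manhattan(nxt, goal)
                heapq.heappush(open_heap, (priority, new_cost, nxt))
    return None
```

The cost dictionary doubles as the "already seen" check, which is what prevents the repeated-square problem mentioned above.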
Edit:
These approaches involve creating objects that represent squares on your map, and creating "paths", or series of squares to hit (or steps to take). Once you build a framework for representing your squares, the problem of what kind of search to use becomes a much less daunting task.
The square class will need to be able to get a list of adjacent squares and know whether a square is traversable.
Considering that you don't have all information, try just treating unexplored tiles as traversable, and recomputing if you find they aren't.
Edit:
As for searching an unknown area for an unknown object...
You can use something like Pledge's algorithm until you've found the boundaries of your space, recording all information as you go. Then go have a look at all unseen squares using your favorite drift/pathfinding algorithm. If, at any point along the way, you see the flag, stop what you're doing and use your favorite pathfinding algorithm to go home.
Part of it will be pathfinding, for example with the A* algorithm.
Part of it will be exploring. Any cell with an unknown neighbour is worth exploring. The best cells to explore are those closest to the robot and with the largest unexplored neighbourhood.
If the robot sees through walls some exploration candidates might be inaccessible and exploration might be required even if the flag is already visible.
It may be worthwhile to reevaluate the current target every time a new cell is revealed. As long as this is only done when new cells are revealed, progress will always be made.
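The exploring part can be sketched as "walk to the nearest known free cell that still has an unknown neighbour". A minimal Python version follows; the dictionary-based map and the name nearest_frontier are assumptions, and it ranks candidates by distance only, ignoring the size of the unexplored neighbourhood.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def nearest_frontier(known, robot):
    """Path to the closest reachable cell with an unexplored neighbour.

    known maps (row, col) -> True for free cells and False for obstacles;
    cells missing from the dictionary are unexplored.
    Returns a list of cells starting at robot, or None if nothing is left.
    """
    came_from = {robot: None}
    queue = deque([robot])
    while queue:
        cell = queue.popleft()
        r, c = cell
        adjacent = [(r + dr, c + dc) for dr, dc in NEIGHBOURS]
        if any(n not in known for n in adjacent):
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for n in adjacent:
            # Every neighbour is known here, so the lookup is safe.
            if known[n] and n not in came_from:
                came_from[n] = cell
                queue.append(n)
    return None
```

Calling this again whenever new cells are revealed gives the re-evaluation described above. A None result means every reachable cell has been fully explored.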
With a simple DFS search at least you will find the flag:)
Well, there are two parts to this.
1) Searching for the Flag
2) Returning Home
For the searching part, I would circle the home point, moving outward every time I made a complete loop. This way, you can search every square and identify whether it is a clear spot, an obstacle, a map boundary or the flag, and so create a map of your environment.
Once the Flag is found, you could either go back the same way, or find a more direct route. If it is more direct route, then you would have to use the map which you have created to find a direct route.
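The outward-circling order can be generated up front. Here is a small Python sketch that lists grid offsets relative to home in a square spiral. It ignores obstacles entirely, so a real robot would still need pathfinding to get from one cell to the next. The function name spiral is just for illustration.

```python
def spiral(n_cells):
    """First n_cells offsets (x, y) of a square spiral starting at home (0, 0)."""
    x = y = 0
    dx, dy = 1, 0
    step_len = 1
    cells = [(0, 0)]
    while len(cells) < n_cells:
        for _ in range(2):          # each side length is used for two sides
            for _ in range(step_len):
                x, y = x + dx, y + dy
                cells.append((x, y))
                if len(cells) == n_cells:
                    return cells
            dx, dy = -dy, dx        # turn 90 degrees
        step_len += 1
    return cells
```

For example, spiral(9) covers exactly the 3x3 block around home, and spiral(25) the 5x5 block.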
What you want is to find all minimal spanning trees in the viewport of the robot and then let the robot choose which MST it wants to travel.
If you met an obstacle, you can go around to determine its precise dimensions, and after measuring it return to the previous course.
With no obstacles in the range of sight you can try to just head in the direction of the nearest unchecked area.
It may not seem the fastest way, but I think it is a good starting point.
I think the approach would be to construct the graph as the robot travels. There will be a function that will return to the robot the particular state of a grid. This is needed since the robot will not know in advance the state of the grid.
You can apply heuristics in the search so the probability of reaching the flag is increased.
As many have mentioned, A* is good for global planning if you know where you are and where your goal is. But if you don't have this global knowledge, there is a class of algorithms called "bug" algorithms that you should look into.
As for exploration, if you want to find the flag the fastest, depending on how much of the local neighborhood your bot can see, you should try not to have this neighborhood overlap. For example, if your bot can see one cell around it in every direction, you should explore every third column (columns 1, 4, 7, etc.). But if the bot can only see the cell it is currently occupying, then the best you can do is to not go back over what you already visited.

Multiple parameter optimization with lots of local minima

I'm looking for algorithms to find a "best" set of parameter values. The function in question has a lot of local minima and changes very quickly. To make matters even worse, testing a set of parameters is very slow - on the order of 1 minute - and I can't compute the gradient directly.
Are there any well-known algorithms for this kind of optimization?
I've had moderate success with just trying random values. I'm wondering if I can improve the performance by making the random parameter chooser have a lower chance of picking parameters close to ones that had produced bad results in the past. Is there a name for this approach so that I can search for specific advice?
More info:
Parameters are continuous
There are on the order of 5-10 parameters. Certainly not more than 10.
How many parameters are there -- eg, how many dimensions in the search space? Are they continuous or discrete - eg, real numbers, or integers, or just a few possible values?
Approaches that I've seen used for this kind of problem have a similar overall structure - take a large number of sample points, and adjust them all towards regions that have "good" answers somehow. Since you have a lot of points, their relative differences serve as a makeshift gradient.
Simulated Annealing: The classic approach. Take a bunch of points and probabilistically move some to a neighbouring point chosen at random, depending on how much better it is.
Particle Swarm Optimization: Take a "swarm" of particles with velocities in the search space and probabilistically move each particle; if it's an improvement, let the whole swarm know.
Genetic Algorithms: This is a little different. Rather than using the neighbours' information like above, you take the best results each time and "cross-breed" them, hoping to get the best characteristics of each.
The wikipedia links have pseudocode for the first two; GA methods have so much variety that it's hard to list just one algorithm, but you can follow links from there. Note that there are implementations for all of the above out there that you can use or take as a starting point.
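To give a feel for the simplest of these, here is a single-walker simulated annealing sketch in Python. The geometric cooling schedule, the Gaussian proposal and all parameter names are assumptions of mine. With a one-minute objective, the iterations budget directly sets your wall-clock time, so pick it accordingly.

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, t_start=1.0, t_end=1e-3,
                        iterations=2000, seed=0):
    """Minimize f starting from x0; returns (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best, fbest = list(x), fx
    for k in range(iterations):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (k / max(1, iterations - 1))
        candidate = [xi + rng.gauss(0.0, step) for xi in x]
        fc = f(candidate)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = candidate, fc
            if fx < fbest:
                best, fbest = list(x), fx
    return best, fbest
```

The temperature scale has to match the scale of your objective's values, which is exactly the kind of tuning that these heuristics need.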
Note that all of these -- and really any approach to this kind of high-dimensional search problem -- are heuristics, which means they have parameters that have to be tuned to your particular problem. That can be tedious.
By the way, the fact that the function evaluation is so expensive can be made to work for you a bit; since all the above methods involve lots of independent function evaluations, that piece of the algorithm can be trivially parallelized with OpenMP or something similar to make use of as many cores as you have on your machine.
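In Python, the same idea might look like the sketch below, using the standard concurrent.futures module instead of OpenMP. The helper name evaluate_batch is made up. Threads only help if the objective spends its time outside the interpreter (for example in an external simulation); for a CPU-bound pure-Python objective you would pass ProcessPoolExecutor instead.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_batch(objective, candidates, max_workers=4,
                   executor_cls=ThreadPoolExecutor):
    """Evaluate all candidates concurrently; results keep the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(objective, candidates))
```

Each generation of a swarm or population then costs roughly one evaluation time per max_workers candidates instead of one per candidate.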
Your situation seems to be similar to that of the poster of Software to Tune/Calibrate Properties for Heuristic Algorithms, and I would give you the same advice I gave there: consider a Metropolis-Hastings like approach with multiple walkers and a simulated annealing of the step sizes.
The difficulty in using Monte Carlo methods in your case is the expensive evaluation of each candidate. How expensive, compared to the time you have at hand? If you need a good answer in a few minutes this isn't going to be fast enough. If you can leave it running overnight, it'll work reasonably well.
Given a complicated search space, I'd recommend a random initial distribution. Your final answer may simply be the best individual result recorded during the whole run, or the mean position of the walker with the best result.
Don't be put off that I was discussing maximizing there and you want to minimize: the figure of merit can be negated or inverted.
I've tried Simulated Annealing and Particle Swarm Optimization. (As a reminder, I couldn't use gradient descent because the gradient cannot be computed).
I've also tried an algorithm that does the following:
Pick a random point and a random direction
Evaluate the function
Keep moving along the random direction for as long as the result keeps improving, speeding up on every successful iteration.
When the result stops improving, step back and instead attempt to move into an orthogonal direction by the same distance.
This "orthogonal direction" was generated by creating a random orthogonal matrix (adapted this code) with the necessary number of dimensions.
If moving in the orthogonal direction improved the result, the algorithm just continued with that direction. If none of the directions improved the result, the jump distance was halved and a new set of orthogonal directions would be attempted. Eventually the algorithm concluded it must be in a local minimum, remembered it and restarted the whole lot at a new random point.
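A simplified reconstruction of that search in Python, for illustration only. It is not the original code: it builds the orthogonal directions with Gram-Schmidt rather than the linked routine, tries both signs of every direction in each round, and leaves out the restart from a new random point. All names and defaults are assumptions.

```python
import math
import random

def random_orthonormal_basis(n, rng):
    """n mutually orthogonal unit vectors, via Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:
            basis.append([vi / norm for vi in v])
    return basis

def direction_search(f, x0, step=1.0, min_step=1e-3, growth=1.5, seed=0):
    """Local search from x0; returns (x, f(x)) once step drops below min_step."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    while step >= min_step:
        improved = False
        for d in random_orthonormal_basis(len(x), rng):
            for sign in (1.0, -1.0):
                move = step
                candidate = [xi + sign * move * di for xi, di in zip(x, d)]
                fc = f(candidate)
                # Keep going in this direction while it pays off,
                # speeding up after every successful move.
                while fc < fx:
                    x, fx, improved = candidate, fc, True
                    move *= growth
                    candidate = [xi + sign * move * di for xi, di in zip(x, d)]
                    fc = f(candidate)
        if not improved:
            step /= 2.0  # nothing helped at this scale: halve the jump
    return x, fx
```

To get the full behaviour described above you would wrap direction_search in a loop over random starting points and keep the best local minimum found.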
This approach performed considerably better than Simulated Annealing and Particle Swarm: it required fewer evaluations of the (very slow) function to achieve a result of the same quality.
Of course my implementations of S.A. and P.S.O. could well be flawed - these are tricky algorithms with a lot of room for tweaking parameters. But I just thought I'd mention what ended up working best for me.
I can't really help you with finding an algorithm for your specific problem.
However in regards to the random choosing of parameters I think what you are looking for are genetic algorithms. Genetic algorithms are generally based on choosing some random input, selecting those, which are the best fit (so far) for the problem, and randomly mutating/combining them to generate a next generation for which again the best are selected.
If the function is more or less continuous (that is, small mutations of good inputs generally won't generate bad inputs, "small" being a somewhat generic term), this would work reasonably well for your problem.
There is no generalized way to answer your question. There are lots of books/papers on the subject matter, but you'll have to choose your path according to your needs, which are not clearly spoken here.
Some things to know, however - 1min/test is way too much for any algorithm to handle. I guess that in your case, you must really do one of the following:
get 100 computers to cut your parameter testing time to some reasonable time
really try to work out your parameters by hand and mind. There must be some redundancy and at least some sanity check so you can test your case in <1min
for possible result sets, try to figure out some 'operations' that modify them slightly instead of just randomizing them. For example, in TSP a basic operator is lambda, which swaps two nodes and thus creates a new route. Yours could be shifting some number up/down by some value.
then, find yourself some nice algorithm; your starting point can be somewhere here. The book is an invaluable resource for anyone starting out with problem-solving.

Another Game of Life question (infinite grid)?

I have been playing around with Conway's Game of life and recently discovered some amazingly fast implementations such as Hashlife and Golly. (download Golly here - http://golly.sourceforge.net/)
One thing that I can't get my head around is how coders implement the infinite grid. We can't keep an infinite array of anything. If you run Golly and get a few gliders to fly off past the edges, wait for a few minutes and zoom right out, you will see the gliders still there, out in space, running away. So how in God's name is this concept of infinity dealt with programmatically? Is there a well-documented pattern or what?
Many thanks
It is possible to represent living nodes with some type of sparse matrix in this situation. For instance, if we store a list of (LivingNode, Coordinate) pairs instead of an array of Nodes where each is either living or dead, we are simply changing the Coordinates rather than increasing an array's size. Thus, the space required for this is proportional to the number of LivingNodes.
This solution doesn't work for states where the number of living nodes is constantly increasing, but it works very well for gliders.
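In Python the sparse idea comes down to keeping a set of live coordinates. A minimal sketch (the function name step is my own):

```python
from collections import Counter

def step(live):
    """One generation; live is a set of (x, y) coordinates of living cells."""
    # Count, for every cell next to a living cell, how many live neighbours it has.
    counts = Counter((x + dx, y + dy)
                     for x, y in live
                     for dx in (-1, 0, 1)
                     for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}
```

Coordinates are ordinary Python integers, so a glider can travel as far as it likes. Memory and time per generation depend only on the number of living cells.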
EDIT: So that was off the top of my head. Turns out Wikipedia has an article that shows a much more well-thought out solution. Oh well! :) Enjoy.
Wikipedia explains it.
The basic idea is that Conway's Game of Life exhibits locality, since information travels at a slow speed compared to the pattern size and the maximum density of filled cells is around 1/2 of the cells in any region. (More will kill off cells due to overcrowding.)
Since there is locality, you can separate the field in different sections and simulate each section independently. If you choose your locality well, you will often see the same patterns. You can simulate how those evolve and store the results in a lookup table, so that other instances of the same pattern do not need to be simulated more than once. Combining adjacent patterns into larger 'metapatterns' allows you to precalculate those as well, and so on.
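The lookup-table part can be illustrated on the smallest useful case: the 2x2 centre of a 4x4 block one generation later depends only on that block, so the result can be cached and reused wherever the same block appears. This Python sketch is nowhere near a full Hashlife (no quadtree, no multi-generation jumps); the name evolve_centre is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def evolve_centre(block):
    """block is a 4x4 tuple of tuples of 0/1; returns its 2x2 centre after one step."""
    def next_state(r, c):
        neighbours = sum(block[r + dr][c + dc]
                         for dr in (-1, 0, 1)
                         for dc in (-1, 0, 1)) - block[r][c]
        alive = block[r][c]
        return 1 if neighbours == 3 or (neighbours == 2 and alive) else 0
    return tuple(tuple(next_state(r, c) for c in (1, 2)) for r in (1, 2))
```

Because blocks are hashable tuples, lru_cache acts as the lookup table. The second time a block is seen, no simulation happens at all.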
