How to simplify a spline? - algorithm

I have an interesting algorithmic challenge in a project I am working on. I have a sorted list of coordinate points pointing at buildings on either side of a street that, sufficiently zoomed in, looks like this:
I would like to take this zigzag and smooth it out to linearize the underlying street.
I can think of a couple of solutions:
Calculate centroids using rolling averages of six or so points, and use those.
Spline regression.
Is there a better or best way to approach this problem? (I am using Python 3.5)

Based on your description and your comments, you are looking for a line simplification algorithm.
The Ramer-Douglas-Peucker algorithm (suggested in the comments) is probably the best-known algorithm in this family, but there are many more.
For example, Visvalingam’s algorithm works by repeatedly removing the point whose removal causes the smallest change, measured as the area of the triangle that the point forms with its two neighbors. This makes it easy to code and intuitive to understand. If the research paper is hard to read, you can read this easy article.
Other algorithms in this family are:
Opheim
Lang
Zhao
Read about them, understand what each one is trying to minimize, and select the most suitable one for you.
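As a rough illustration of the triangle-area idea, here is a minimal sketch in Python. The function names `triangle_area` and `visvalingam` and the `n_keep` parameter are made up for this example, and the loop recomputes every area on each pass (quadratic time), whereas real implementations usually keep the areas in a heap.

```python
def triangle_area(p, q, r):
    """Area of the triangle p-q-r (half the absolute cross product)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def visvalingam(points, n_keep):
    """Drop the interior point with the smallest triangle area until n_keep remain.

    The two end points are never removed. Hypothetical helper, not a library API.
    """
    pts = list(points)
    while len(pts) > max(n_keep, 2):
        # Area for each interior point, formed with its two current neighbors.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(range(len(areas)), key=areas.__getitem__)
        del pts[smallest + 1]  # +1 because areas[0] belongs to pts[1]
    return pts


line = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
simplified = visvalingam(line, 4)  # the nearly collinear point (1, 0.1) goes first
```

Ties between equal areas are broken by position here, which is an arbitrary choice.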

Dali's post correctly surmises that a line simplification algorithm is useful for this task. Before posting this question I actually examined a few such algorithms but wasn't quite comfortable with them because even though they resulted in the simplified geometry that I liked, they didn't directly address the issue I had of points being on either side of the feature and never in the middle.
Thus I used a two-step process:
I computed the centroids of the polyline by using a rolling average of the coordinates of the five surrounding points. This didn't help much with smoothing the function but it did mostly succeed in remapping them to the middle of the street.
I applied Visvalingam’s algorithm to the new polyline, with n=20 points specified (using this wonderful implementation).
The result wasn't quite perfect but it was good enough:
Thanks for the help everyone!
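For anyone who wants to reproduce the first step, a rolling centroid can be sketched roughly like this. The name `rolling_centroids` and the way the window is clipped at the ends of the list are assumptions made for this example, not necessarily what was used above.

```python
def rolling_centroids(points, window=5):
    """Replace each point with the mean of the points in a window centered on it.

    Near the ends of the list the window is clipped, so fewer points are averaged.
    """
    half = window // 2
    centroids = []
    for i in range(len(points)):
        chunk = points[max(0, i - half):min(len(points), i + half + 1)]
        cx = sum(p[0] for p in chunk) / len(chunk)
        cy = sum(p[1] for p in chunk) / len(chunk)
        centroids.append((cx, cy))
    return centroids


# A zigzag that alternates between the two sides of a street along y = 0.
zigzag = [(i, 1 if i % 2 else -1) for i in range(10)]
centered = rolling_centroids(zigzag, window=5)
```

With an odd window on a strictly alternating zigzag one side is always over-represented by one point, so the result still wobbles slightly around the centerline, which matches the observation that this step recenters the points more than it smooths them. The output can then be passed to whichever Visvalingam implementation you prefer.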

Related

Procedural Maze Algorithm With Cells Determined Independently of Neighbors

I was thinking about maze algorithms recently (mostly because I'm working on a game, but I felt this is a more general question than game development related). In simple terms, I was wondering if there is a sort of maze algorithm that can generate (a possibly infinite number of) cells without any information specifically about the cell's neighbors. I imagine, if such a thing were possible, it would rely heavily upon noise functions such as Perlin or Simplex.
Each cell has four walls, these are used when actually rendering the maze so that corridors and walls are not the same thickness.
Let's say, for example, I'd like a cell at (32, 15) to generate its walls.
I know of algorithms like Eller's (which requires a limited number of columns, but allows infinite rows) and the Virtual Fractal Mazes algorithm (which needs to know previous cells in order to build upon them infinitely in both x and y directions).
Does anyone know of any algorithm I could look into for this specific request? If not, are there any algorithms that are good for chunk-based mazes that you know of?
(Note: I did search around for a bit through StackOverflow to see if there were any questions with similar requests to mine, but I did not come across any. If you happen to know of one, a link would be greatly appreciated :D)
Thank you in advance.
Seeeeeecreeeets. My preeeeciooouss secretts. But yeah I can understand the frustration so I'll throw this one to you OP/SO. Feel free to update the PCG Wiki if you're not as lazy as me :3
There are actually many ways to do this. Some of the best techniques for procgen are:
Asking what you really want.
Design backwards. Play in reverse. Result is forwards.
Look at a random sampling of your target goal and try to see overall patterns.
But to get back to the question, there are two simple ways and they both start from asking what you really want. I'll give those first.
The first is to create 2 layers. Both are random noise. You connect the top and the bottom so they're fully connected. No isolated portions. This is asking what you really want which is connected-ness. And then guaranteeing it in a local clean-up step. (tbh I forget the function applied to layer 2 that guarantees connected-ness. I can't find the code atm.... But I remember it was a really simple local function... XOR, Curl, or something similar. Maybe you can figure it out before I fix this).
The second way is using the properties of your functions. As long as your random function is smooth enough you can take the gradient and assign a tile to it. The way you assign the tiles changes the maze structure but you can guarantee connectivity by clever selection of tiles for each gradient (b/c similar or opposite gradients are more likely to be near each other on a smooth gradient function). From here your smooth random can be any form of Perlin Noise, etc. Once again, an "asking what you want" technique.
For backwards-reversed you unfortunately have an NP problem (I'm not sure if it's hard, complete, or whatever; it's been a while since I've worked on this...). So while you could generate a random map of distances down a maze path and then from there generate the actual obstacles, it's not really advisable. There's also a ton of consideration on different cases even for small mazes...
012
123
234
Is simple. There's a column in the lower right corner of 0 and the middle 2 has an _| shaped wall.
042
123
234
This one makes less sense. You still are required to have the same basic walls as before on all the non-changed squares... But you can't have that 4. It needs to be within 1 of at least a single neighbor. (I mean you could have a +3 cost for that square by having something like a conveyor belt or something, but then we're out of the maze problem) Okay so....
032
123
234
Makes more sense but the 2 in the corner is nonsense once again. Flipping that from a trough to a peak would give.
034
123
234
Which makes sense. At any rate, if you can get to this point then looking at local neighbors will give you the walls: if the difference between two neighboring squares is +/-1 then there is no wall between them, otherwise there is a wall. Also note that you can break the rules for the distance map in a consistent way and make a maze just fine. (Like instead of allowing a column, picking a wall and throwing it up. This is just loop splitting at this point and should be safe)
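Taking the +/-1 rule literally, the wall extraction from a distance map could be sketched like this in Python. The function name and the representation of a wall as a pair of neighboring (row, column) squares are assumptions for the sake of the example.

```python
def walls_from_distance_map(dist):
    """Return the set of walls between 4-neighbors of a distance map.

    A wall is placed between two neighboring squares unless their distances
    differ by exactly 1. Each wall is a pair ((r, c), (r2, c2)).
    """
    rows, cols = len(dist), len(dist[0])
    walls = set()
    for r in range(rows):
        for c in range(cols):
            # Only look right and down so every pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols and abs(dist[r][c] - dist[r2][c2]) != 1:
                    walls.add(((r, c), (r2, c2)))
    return walls


grid = [
    [0, 3, 4],
    [1, 2, 3],
    [2, 3, 4],
]
walls = walls_from_distance_map(grid)  # only the 0|3 pair in the top row gets a wall
```

This only covers walls between squares. Corner columns and the outer boundary would need separate handling.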
For random sampling as the final one that I'm going to look at... Certain maze generation algorithms in the limit take on some interesting properties either as an average configuration or after millions of steps. Some form Voronoi regions. Some form concentric circles with a randomly flipped wall to allow a connection between loops. Etc. The loop one is a good example to work off of. Create a set of loops. Flip a random wall on each loop. One will delete a wall which will create access to the next loop. One will split a path and offer a dead-end and a continuation. For a random flip to be a failure there has to be an opening and a split made right next to each other (unless you allow diagonals then we're good). So make loops. Generate random noise per loop. Xor together. Replace local failures with a fixed path if no diagonals are allowed.
So how do we get random noise per loop? Or how do we get better loops than just squares? Just take a random function. Separate divergence and now you have a loop map. If you have the differential equations for the source random function you can pick one random per loop. A simpler way might be to generate concentric circular walls and pick a random point at each radius to flip. Then distort the final result. You have to be careful your distortion doesn't violate any of your path-connected-ness conditions at that point though.
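A much simplified version of the concentric idea, where every cell can work out its own walls from nothing but its coordinates and a seed, might look like the sketch below. It uses square rings around the origin instead of circles, opens exactly one gate between consecutive rings, and skips both the path-splitting flips and the distortion step. All names here (`ring`, `gate`, `has_wall`, the seed mixing) are invented for the example.

```python
import random

DIRS = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def ring(cell):
    """Index of the square ring (loop) that a cell belongs to."""
    return max(abs(cell[0]), abs(cell[1]))


def gate(r, seed=0):
    """The single opening between ring r and ring r + 1, as (inner, outer)."""
    rng = random.Random(seed * 1000003 + r)  # depends only on seed and ring index
    side = rng.choice("EWNS")
    t = rng.randint(-r, r)  # position along the chosen side
    inner = {"E": (r, t), "W": (-r, t), "N": (t, r), "S": (t, -r)}[side]
    dx, dy = DIRS[side]
    return inner, (inner[0] + dx, inner[1] + dy)


def has_wall(cell, direction, seed=0):
    """True if there is a wall on the given side of the cell."""
    dx, dy = DIRS[direction]
    other = (cell[0] + dx, cell[1] + dy)
    if ring(cell) == ring(other):
        return False  # same loop: the corridor continues
    inner, outer = sorted((cell, other), key=ring)
    return (inner, outer) != gate(ring(inner), seed)


# Walls of the cell at (32, 15), computed without generating any other cell.
walls_32_15 = [d for d in "EWNS" if has_wall((32, 15), d)]
```

Because each ring is a closed corridor and each pair of consecutive rings shares one gate, every cell is reachable from the origin. The result is a spiral-like layout rather than an interesting maze, so treat it as a starting point for the flips and distortion described above.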

How to find neighboring solutions in simulated annealing?

I'm working on an optimization problem and attempting to use simulated annealing as a heuristic. My goal is to optimize placement of k objects given some cost function. Solutions take the form of a set of k ordered pairs representing points in an M*N grid. I'm not sure how to best find a neighboring solution given a current solution. I've considered shifting each point by 1 or 0 units in a random direction. What might be a good approach to finding a neighboring solution given a current set of points?
Since I'm also trying to learn more about SA, what makes a good neighbor-finding algorithm and how close to the current solution should the neighbor be? Also, if randomness is involved, why is choosing a "neighbor" better than generating a random solution?
I would split your question into several smaller ones:
Also, if randomness is involved, why is choosing a "neighbor" better than generating a random solution?
Usually, you pick multiple points from a neighborhood, and you can explore all of them. For example, you generate 10 points randomly and choose the best one. By doing so you can efficiently explore more possible solutions.
Why is it better than a random guess? Good solutions tend to have a lot in common (e.g. they are close to each other in the search space). So by introducing small incremental changes, you would be able to find a good solution, while a random guess could send you to a completely different part of the search space and you'll never find an appropriate solution. And because of the curse of dimensionality random jumps are not better than brute force - there will be too many places to jump.
What might be a good approach to finding a neighboring solution given a current set of points?
I regret to tell you that this question seems to be unsolvable in general. :( It's a mix between art and science. Choosing the right way to explore a search space is too problem specific. Even for solving a placement problem under varying constraints, different heuristics may lead to completely different results.
You can try the following:
Random shifts by a fixed number of steps (1, 2, ...). That's your approach.
Swapping two points
You can memorize bad moves for some time (something similar to tabu search), so you will use only 'good' ones next 100 steps
Use a greedy approach to generate a suboptimal placement, then improve it with methods above.
Try random restarts. At some stage, drop all of your progress so far (except for the best solution so far), raise the temperature and start again from a random initial point. You can do this every 10000 steps or something similar.
Fix some points. Put an object at point (x,y) and do not move it at all, try searching for the best possible solution under this constraint.
Prohibit some combinations of objects, e.g. "distance between p1 and p2 must be larger than D".
Mix all steps above in different ways
Try to understand your problem in its tiniest details. You can derive some useful information/constraints/insights from your problem description. Assume that you can't solve the placement problem in general, so try to reduce it to a more specific (== simpler, == with smaller search space) problem.
I would say that the last bullet is the most important. Look closely at your problem and consider only its practical aspects. For example, the size of your problem might allow you to enumerate something, or maybe some placements are not possible for you, and so on and so forth. There is no way for SA to derive such domain-specific knowledge by itself, so help it!
How to understand that your heuristic is a good one? Only by practical evaluation. Prepare a decent set of tests with obvious/well-known answers and try different approaches. Use well-known benchmarks if there are any of them.
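To make the random shift move from the list above concrete, here is a minimal sketch in Python. It assumes a solution is a frozenset of (x, y) pairs on an M*N grid with 0 <= x < M and 0 <= y < N, and that two objects may not share a square. The name `shift_neighbor` and the retry limit are choices made for this example only.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def shift_neighbor(solution, m, n, rng=random, max_tries=100):
    """Return a copy of the solution with one point moved by one grid step.

    Moves that leave the grid or land on an occupied square are rejected and
    retried. If no legal move is found, the solution is returned unchanged.
    """
    points = list(solution)
    for _ in range(max_tries):
        i = rng.randrange(len(points))
        dx, dy = rng.choice(STEPS)
        x, y = points[i]
        candidate = (x + dx, y + dy)
        in_grid = 0 <= candidate[0] < m and 0 <= candidate[1] < n
        if in_grid and candidate not in solution:
            return frozenset(points[:i] + [candidate] + points[i + 1:])
    return frozenset(solution)


current = frozenset({(0, 0), (2, 3), (4, 1)})
neighbor = shift_neighbor(current, 5, 5, random.Random(0))
```

Moving a single point keeps the neighbor close to the current solution, so most of the cost carries over. Larger steps or moving several points at once are the obvious knobs to try when the search gets stuck.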
I hope that this is helpful. :)

Project Euler #163 understanding

I spent quite a long time searching for a solution to this problem. I drew tons of cross-hatched triangles, counted the triangles in simple cases, and searched for some sort of pattern. Unfortunately, I hit the wall. I'm pretty sure my programming/math skills did not meet the prereq for this problem.
So I found a solution online in order to gain access to the forums. I didn't understand most of the methods at all, and some just seemed too complicated.
Can anyone give me an understanding of this problem? One of the methods, found here: http://www.math.uni-bielefeld.de/~sillke/SEQUENCES/grid-triangles (Problem C)
allowed for a single function to be used.
How did they come up with that solution? At this point, I'd really just like to understand some of the concepts behind this interesting problem. I know looking up the solution was not part of the Euler spirit, but I'm fairly sure I would not have solved this problem anyhow.
This is essentially a problem in enumerative combinatorics, which is the art of counting combinations of things. It's a beautiful subject, but probably takes some warming up to before you can appreciate the ninja tricks in the reference you gave.
On the other hand, the comments in the solutions thread for the problem indicate that many have solved the problem using a brute force approach. One of the most common tricks involves taking all possible combinations of three lines in the diagram, and seeing whether they yield a triangle that is inside the largest triangle.
You can cut down the search space considerably by noting that the lines are in one of six directions. Since a combination of lines that includes two lines that are parallel will not yield a triangle, you can iterate over line triples so that each line in the triple has a different direction.
Given three lines, calculate their intersection points. You will have three possibilities
1) the lines are coincident - they all intersect in a common point
2) two of the lines intersect at a point outside the triangle
3) all three points of intersection are distinct, and they all lie within the outer triangle
Just count the combos satisfying condition (3) and you are done. The number of line combos you have to test is O(n^3), which is not prohibitive.
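A rough sketch of this brute force in Python follows. It rests on one modelling assumption that is mine, not the problem's: points are written in oblique coordinates (a, b) along two sides of the big triangle, so the big triangle is a >= 0, b >= 0, a + b <= n, and the six line directions become the families b = k, a = k, a + b = k, a - b = k, a + 2b = k and 2a + b = k. Lines that only touch the big triangle at a corner are left out because they cannot contribute a triangle.

```python
from fractions import Fraction
from itertools import combinations


def build_lines(n):
    """Lines p*a + q*b = k, grouped into the six directions."""
    families = [
        ((0, 1), range(0, n)),          # b = k
        ((1, 0), range(0, n)),          # a = k
        ((1, 1), range(1, n + 1)),      # a + b = k
        ((1, -1), range(-(n - 1), n)),  # a - b = k
        ((1, 2), range(1, 2 * n)),      # a + 2b = k
        ((2, 1), range(1, 2 * n)),      # 2a + b = k
    ]
    return [[(p, q, k) for k in ks] for (p, q), ks in families]


def intersect(l1, l2):
    """Exact intersection of two non-parallel lines (Cramer's rule)."""
    p1, q1, k1 = l1
    p2, q2, k2 = l2
    det = p1 * q2 - p2 * q1
    return Fraction(k1 * q2 - k2 * q1, det), Fraction(p1 * k2 - p2 * k1, det)


def inside(pt, n):
    a, b = pt
    return a >= 0 and b >= 0 and a + b <= n


def count_triangles(n):
    total = 0
    # One line from each of three different directions, so no two are parallel.
    for g1, g2, g3 in combinations(build_lines(n), 3):
        for l1 in g1:
            for l2 in g2:
                p12 = intersect(l1, l2)
                if not inside(p12, n):
                    continue  # case 2: a corner falls outside
                for l3 in g3:
                    p13 = intersect(l1, l3)
                    if p13 == p12:
                        continue  # case 1: the three lines meet in one point
                    if inside(p13, n) and inside(intersect(l2, l3), n):
                        total += 1  # case 3
    return total
```

The problem statement gives 16 triangles for size 1 and 104 for size 2, which makes a handy check before running larger sizes. Exact fractions keep the "same point" and "inside" tests free of rounding problems at the cost of some speed.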
EDIT1: rereading your question, I get the impression you might be more interested in getting an explanation of the combinatorics solution/formula than an outline of a brute force approach. If that's the case, say so and I'll delete this answer. But I'd also say that the question in that case would not be suitable for this site.
EDIT2: See also a combinatorics solution by Bill Daly and others. It is mathematically a little gentler than the other one.
I have not solved this problem for Project Euler and am going off of the question and the solution you provided. In the case of the single function, the methodology presented was ultimately simple pattern finding. The solver broke the presented question into three parts, based on the types of triangles that were present from the intersections. It's a fairly standard approach to this kind of problem: break the larger pattern down into smaller ones to make solving easier. The functions used to express the various forms of triangles, I can only assume, were generated with either a very acute pattern-finding mind or some number theory / geometry. That is also beyond the scope of this explanation and my knowledge. This problem has nothing to do with programming. It's basically entirely mathematics. If you read through the site you linked you can see the logic that is gone through to reach the solution.

Collision Points in GJK

Is there a way to modify the Gilbert-Johnson-Keerthi algorithm so it finds the points of collision between two bodies instead of a true/false result? From what I've understood, the returned distance value could be used to find these points. I searched the web but didn't find any hints.
What you are asking for is not well-posed. If they are colliding, then a point of intersection is undefined -- since the intersection is actually a region of overlap and thus could be any number of possible points. Instead, you should think about a "point of intersection" as a coordinate in space-time, (dx,dy,dz,t), representing the time of impact, together with a translation vector between the two bodies giving you their relative configurations.
One way to modify GJK to compute a space-time intersection is to do a binary search over the swept volume to find the moment of time right before impact. Using this data, you can compute a separating axis and corresponding extremal points for both bodies, which gives you a close approximation of the point of impact. This approach can also be fast if you reuse the simplices from previous iterations of the search to speed up subsequent tests. Christer Ericson has some notes on this technique here: http://realtimecollisiondetection.net/pubs/SIGGRAPH04_Ericson_GJK_notes.pdf
This paper covers your question, I believe, and is up to date. I don't have any code and I'm not going to re-explain it, but the author also has a presentation up on YouTube explaining it. I'm working on the code now, and there are very few examples, but this is what you want. You can use the "less effective" way mentioned in the paper, as it will work just fine for your work unless your goal is extremely high performance.
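The binary search over time can be sketched independently of GJK itself. In the Python sketch below, `overlaps_at(t)` is a hypothetical callback that would place both bodies at time t and run the ordinary boolean GJK test. The example substitutes a trivial circle test for it. The search assumes the bodies are separated at `t0`, overlapping at `t1`, and do not pass completely through each other in between.

```python
def time_of_impact(overlaps_at, t0=0.0, t1=1.0, tol=1e-6):
    """Bisect for the last time in [t0, t1] at which the bodies are still separated.

    Returns t0 if they already overlap at t0, and None if they do not overlap
    at t1 (no impact detected by this test).
    """
    if overlaps_at(t0):
        return t0
    if not overlaps_at(t1):
        return None
    lo, hi = t0, t1  # invariant: separated at lo, overlapping at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if overlaps_at(mid):
            hi = mid
        else:
            lo = mid
    return lo


def circles_overlap_at(t):
    # Stand-in for a GJK query: a unit circle fixed at the origin and a unit
    # circle starting at x = 5 that moves left at 4 units per time unit.
    distance = abs(5.0 - 4.0 * t)
    return distance <= 2.0


toi = time_of_impact(circles_overlap_at)  # contact starts at t = 0.75
```

At the returned time the bodies are separated by a tiny gap, so the closest points reported by a distance-returning GJK at that instant approximate the contact point.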
"Improving the GJK algorithm for faster and more reliable distance queries between convex objects"
MATTIA MONTANARI and NIK PETRINIC University of Oxford
ETTORE BARBIERI Queen Mary University of London
https://ora.ox.ac.uk/objects/uuid:69c743d9-73de-4aff-8e6f-b4dd7c010907/download_file?safe_filename=GJK.PDF&file_format=application%2Fpdf&type_of_work=Journal+article

3 dimensional bin packing algorithms

I'm faced with a 3 dimensional bin packing problem and am currently conducting some preliminary research as to which algorithms/heuristics are currently yielding the best results. Since the problem is NP-hard I do not expect to find the optimal solution in every case, but I was wondering:
1) what are the best exact solvers? Branch and Bound? What problem instance sizes can I expect to solve with reasonable computing resources?
2) what are the best heuristic solvers?
3) What off-the-shelf solutions exist to conduct some experiments with?
As far as off-the-shelf solutions go, check out MAXLOADPRO for loading trucks. It may be able to be configured to load any rectangular volume, but I haven't tried that yet. In general 3d bin-packing problems have the added complication that the objects can be rotated into different positions so for any object with a given length, width and height, you effectively have to create three variables representing each position, but you only use one in the solution.
In general, stand-alone MIP formulations (or branch and bound) don't work well for the 2d or 3d problem but constraint programming has met with some success producing exact solutions for the 2d problem. Check out this abstract. Without looking at the paper, I like the decomposition approach for the problem where you're trying to minimize the number of same-sized bins. I haven't seen as many results for the 3d problem, but let us know if you find any that are implementable.
Good luck!
I've written a program which tests three different algorithms. Also this is a good source of information: A Thousand Ways to Pack the Bin - A Practical Approach to Two-Dimensional Rectangle Bin Packing. It is for two-dimensional rectangle bin packing, but you can always extend it to 3D.
From wikipedia:
Although these simple strategies are often good enough, efficient approximation algorithms have been demonstrated that can solve the bin packing problem within any fixed percentage of the optimal solution for sufficiently large inputs
Here are the two sources they give for this:
Approximation Algorithms
Bin packing can be solved within 1 + ε in linear time
Best exact solver: Use dynamic programming.
State variables:
Items you have packed and discarded.
Space filled in the container.
If the container is a parallelepiped grid, and the items "fit" in exact cells of the grid, you can use a 3-dimensional array to represent state variable 2. Otherwise, you will have to use more complex data structures.
Best heuristic solvers
I don't know. Perhaps Variable Neighborhood Search. There are some similarities between your problem and the timetable construction problem (which I'm working on), so the same heuristic might be good for both.
Off-the-shelf solutions to conduct experiments
I'm sorry, I don't even have a clue.
Your question is similar to:
3d bin packing algorithm
Although, because you disallow rotation, you can get pretty good results. I suggest looking more towards a FIRST-FIT-DECREASING solution.
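As a starting point, first-fit decreasing can be sketched on volumes alone. This is a deliberate simplification: it ignores the shapes of the items entirely, so the bin count it produces is an optimistic estimate, and a real 3D version would replace the `vol <= free[b]` check with a geometric placement test. The function name and the return format are made up for this example.

```python
def first_fit_decreasing(item_volumes, bin_volume):
    """Assign items to bins by volume only, largest items first.

    Returns a list of bins, each a list of item indices. Shapes are ignored,
    so a returned packing is not guaranteed to be physically feasible.
    """
    bins = []  # item indices per bin
    free = []  # remaining volume per bin
    order = sorted(range(len(item_volumes)),
                   key=lambda i: item_volumes[i], reverse=True)
    for i in order:
        vol = item_volumes[i]
        if vol > bin_volume:
            raise ValueError("item %d does not fit in an empty bin" % i)
        for b in range(len(bins)):
            if vol <= free[b]:  # first bin with enough room wins
                bins[b].append(i)
                free[b] -= vol
                break
        else:
            bins.append([i])
            free.append(bin_volume - vol)
    return bins


packing = first_fit_decreasing([4, 8, 1, 4, 2, 1], bin_volume=10)
```

Sorting by decreasing size places the awkward large items while there is still room for them, which is why this heuristic tends to do better than plain first-fit.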
3dbinpacking is a commercial solution (not an algorithm) exposing an API to consume with nice visualization. It offers:
Single bin packing
Multi bin packing
Find third dimension
Find a bin dimensions

Resources