Who created the recursive division maze algorithm? - algorithm

I am working on a project that uses a form of the recursive division algorithm that is normally used to create a fractal-like maze. Now I would like to cite the creator/author of this algorithm to give them credit, but I am unsure who invented it.
Jamis Buck has a very famous blog entry https://weblog.jamisbuck.org/2011/1/12/maze-generation-recursive-division-algorithm , that is titled: "A novel method for generating fractal-like mazes is presented, with sample code and an animation" but did he actually came up with that idea? I do not understand his post as that he invented it, to me it sounds more like he is just describing it.
I was not able to find a clear publication/author through Google and Wikipedia(en) or any other specifically maze-focused site. Does anyone know about the origin of this algorithm?

No, the algorithm is not mine; I learned it from Walter Pullen’s page, here: https://www.astrolog.org/labyrnth/algrithm.htm. He also does not cite an author for this algorithm. It should be noted that many maze algorithms are adaptations of more general graph algorithms, so it could be that this one has an analog in graph theory somewhere...
At any rate, it’s not mine. I used the word “novel” in my post in the second sense given here (https://www.merriam-webster.com/dictionary/novel) “original or striking especially in conception or style”, not in the sense of “new”.

The recursive division algorithm was invented and named by John Perry, who added it to Wikipedia's Maze generation algorithm page in November 2006.
To assist with questions like this, Walter Pullen's Maze generation algorithms page has been updated to add a "Source" column to the table of Maze generation and solving algorithms, to list who invented, named, or otherwise popularized each algorithm. Yes, it is true that many Maze algorithms are adaptations of general graph theory algorithms, and that many algorithms are similar to or variations on others, etc.
For example, recursive division is a type of "nested Mazes", "fractal Mazes", or "Mazes by subdivision", which means generating smaller Mazes within each cell of a larger Maze. That is an older and more general concept, and others (for example the Maze generation program Daedalus) have done nested cell Mazes since 2003. However, it would always divide the Maze into the SAME sized submazes, e.g. a 2x2 equal sized cell Maze with equal sized 2x2 cell Mazes within each cell, with the process repeated 6 times.
John Perry introduced the idea of dividing a Maze into RANDOM sized 2x2 cell Mazes instead of just fixed sized 2x2 cell Mazes, which can generate a Maze of any dimensions instead of just 2^N power passages as with nested cell. The result looks a bit more organic (like a tree trunk) instead of grid based (like a Borg cube). Note that the implementation of recursive division in Daedalus is a bit more generalized, in that Daedalus divides the Maze into either 1x2 or 2x1 cell Mazes (option chosen randomly too) which looks even more random and organic. Also, note that both same and random sized division ("nested cell" algorithm and "recursive division" algorithm) can be implemented in a virtual manner to produce virtual Mazes of enormous size (in which not all of the Maze has to be in memory at once).

Related

SGM Disparity subpixel estimation - how to?

Some weeks ago I've implemented a simple block matching stereo algorithm but the results had been bad. So I've searched on the Internet to find better algorithms. There I found the semi global matching (SGM), published by Heiko Hirschmueller. It gets one of the best results in relation to its processing time.
I've implemented the algorithm and got really good results (compared to simple block matching) as you can see here:
I've reprojected the 2D points to 3D by using the calculated disparity values with the following result
At the end of SGM I have an array with aggregated costs for each pixel. The disparity is equivalent to the index with the lowest cost value.
The problem is, that searching for the minimum only returns discrete values. This results in individually layers in the point-cloud. In other words: Round surfaces are cut into many layers (see point cloud).
Heiko mentioned in his paper, that it would be easy to get sub-pixel accuracy by fitting a polynomial function into the cost array and take the lowest point as disparity.
The problem is not bound to stereo vision, so in other words the task is the following:
given: An array of values, representing a polynomial function.
wanted: The lowest point of the polynomial function.
I don't have any idea how to do this. I need a fast algorithm, because I have to run this code for every pixel in the Image
For example: 500x500 Pixel with 60-200 costs each => Algorithm has to run 15000000-50000000 times!!).
I don't need a real time solution! My current SGM implementation (L2R and R2L matching, no cuda or multi-threading yet) takes about 20 seconds to process an image with 500x500 pixels ;).
I don't ask for libraries! I try to implement my own independent computer vision library :).
Thank you for your help!
With kind regards,
Andreas
Finding the exact lowest point in a general polynomial is a hard problem, since it is equivalent to finding the root of the derivative of the polynomial. In particular, if your polynomial is of degree 6, the derivative is a quintic polynomial, which is known not to be solvable by radical. You therefore need to either: fit the function using restricted families for which computing the roots of the derivatives e.g. the integrals of prod_i(x-ri)p(q) where deg(p)<=4, OR
using an iterative method to find an APPROXIMATE minimum, (newton's method, gradient descent).

How do I guarantee that a cellular automata generated maze is solvable/interesting?

I am writing a maze generation algorithm, and this wikipedia article caught my eye. I decided to implement it in java, which was a cinch. The problem I am having is that while a maze-like picture is generated, the maze often is not solvable and is not often interesting. What I mean by interesting is that there are a vast number of unreachable places and often there are many solutions.
I implemented the 1234/3 rule (although is is changeable easily, see comments for an explanation) with a roughly 50/50 distribution in the start. The mazes always reach an equilibrium where there is no change between t-steps.
My question is, is there a way to guarantee the mazes solvability from a fixed start and endpoint? Also, is there a way to make the maze more interesting to solve (fewer/one solution and few/no unreachable places)? If this is not possible with cellular automata, please tell me. Thank you.
I don't think it's possible to ensure a solvable, interesting maze through simple cellular automata, unless there's some specific criteria that can be placed on the starting state. The fact that cells have no knowledge of the overall shape because each cell won't be able to coordinate with the group as a whole.
If you're insistent on using them, you could do some combination of modification and pathfinding after generation is finished, but other methods (like the ones shown in the Wikipedia article or this question) are simpler to implement and won't result in walls that take up a whole cell (unless you want that).
the root of the problem is that "maze quality" is a global measure, but your automaton cells are restricted to a very local knowledge of the system.
to resolve this, you have three options:
add the global information from outside. generate mazes using the automaton and random initial data, then measure the maze quality (eg using flood fill or a bunch of other maze solving techniques) and repeat until you get a result you like.
use a much more complex set of explicit rules and state. you can work out a set of rules / cell values that encode both the presence of walls and the lengths / quality of paths. for example, -1 would be a wall and a positive value would be the sum of all neighbours above and to the left. then positive values encode the path distance from top left, roughly. that's not enough, but it shows the general idea... you need to encode an algorithm about the maze "directly" in the rules of the system.
use a less complex, but still turing complete, set of rules, and encode the rules for maze generation in the initial state. for example, you could use conway's life and construct an initial state that is a "program" that implements maze generation via gliders etc etc.
if it helps any you could draw a parallel between the above and:
ghost in the machine / external user
FPGA
programming a general purpose CPU
Run a path finding algorithm over it. Dijkstra would give you a sure way to compute all solutions. A* would give you one good solution.
The difficulty of a maze can be measured by the speed at which these algorithms solve it.
You can add some dead-ends in order to shut down some solutions.

Lack of diversification, is it really a drawback of Genetic Algorithms?

We know that Genetic Algorithms (or evolutionary computation) work with an encoding of the points in our solution space Ω rather than these points directly. In the literature, we often find that GAs have the drawback : (1) since many chromosomes are coded into a similar point of Ω or similar chromosomes have very different points, the efficiency is quite low. Do you think that is really a drawback ? because these kind of algorithms uses the mutation operator in each iteration to diversify the candidate solutions. To add more diversivication we simply increase the probability of crossover. And we mustn't forget that our initial population ( of chromosones ) is randomly generated ( another more diversification). The question is, if you think that (1) is a drawback of GAs, can you provide more details ? Thank you.
Mutation and random initialization are not enough to combat the problem that is known as genetic drift which is the major problem of genetic algorithms. Genetic drift means that the GA may quickly lose most of its genetic diversity and the search proceeds in a way that is not beneficial for crossover. This is because the random initial population quickly converges. Mutation is a different thing, if it is high it will diversify, true, but at the same time it will prevent convergence and the solutions will remain at a certain distance to the optimum with higher probability. You will need to adapt the mutation probability (not the crossover probability) during the search. In a similar manner the Evolution Strategy, which is similar to a GA, adapts the mutation strength during the search.
We have developed a variant of the GA that is called OffspringSelection GA (OSGA) which introduces another selection step after crossover. Only those children will be accepted that surpass their parents' fitness (the better, the worse or any linearly interpolated value). This way you can even use random parent selection and put the bias on the quality of the offspring. It has been shown that this slows the genetic drift. The algorithm is implemented in our framework HeuristicLab. It features a GUI so you can download and try it on some problems.
Other techniques that combat genetic drift are niching and crowding which let the diversity flow into the selection and thus introduce another, but likely different bias.
EDIT: I want to add that the situation of having multiple solutions with equal quality might of course pose a problem as it creates neutral areas in the search space. However, I think you didn't really mean that. The primary problem is genetic drift, ie. the loss of (important) genetic information.
As a sidenote, you (the OP) said:
We know that Genetic Algorithms (or evolutionary computation) work with an encoding of the points in our solution space Ω rather than these points directly.
This is not always true. An individual is coded as a genotype, which can have any shape, such as a string (genetic algorithms) or a vector of real (evolution strategies). Each genotype is transformed into a phenotype when assessing the individual, i.e. when its fitness is calculated. In some cases, the phenotype is identical to the genotype: it is called direct coding. Otherwise, the coding is called indirect. (you may find more definitions here (section 2.2.1))
Example of direct encoding:
http://en.wikipedia.org/wiki/Neuroevolution#Direct_and_Indirect_Encoding_of_Networks
Example of indirect encoding:
Suppose you want to optimize the size of a rectangular parallelepiped dened by its length, height and width. To simplify the example, assume that these three quantities are integers between 0 and 15. We can then describe each of them using a 4-bit binary number. An example of a potential solution may be to genotype 0001 0111 01010. The corresponding phenotype is a parallelepiped of length 1, height 7 and width 10.
Now back to the original question on diversity, in addition to what DonAndre said you could read you read chapter 9 "Multi-Modal Problems and Spatial Distribution" of the excellent book Introduction to Evolutionary Computing written by A. E. Eiben and J. E. Smith. as well as a research paper on that matter such as Encouraging Behavioral Diversity in Evolutionary Robotics: an Empirical Study. In a word, diversity is not a drawback of GA, it is "just" an issue.

Pairwise alignment of long string sequences

I want to find the globally optimal (or close to optimal) pairwise alignment between two long (tens of thousands) sequences of strings, but the algorithm is expected to operate on any object sequences.
I also want to use my own distance function implementation to compute the similarity of two objects. For shorter sequences, I could use the dynamic time warping (DTW) algorithm but the DTW algorithm needs to compute and store a n*m distance matrix (n,m are lengths of the sequences) which is not feasible for longer sequences. Can you recommend such algorithm? A working implementation would be a plus.
The following example clarifies what the algorithm needs to do:
Input:
Sequence A: i saw the top of the mountain
Sequence B: then i see top of the mountains
Result:
Sequence A: i saw the top of the mountain
Sequence B: then i see top of the mountains
I don't know, if I understood your requirements right, but it sounds to me like you are trying to solve a Stable Marriage Problem. The original solution of Gale and Shapley in the link is in O(n*m) time and O(n+m) space, if my memory serves me right. The implementation was pretty straightforward.
There are also some later solutions with different variants of the problem.
You also could try to solve this problem by using Maximum Bipartite Graph Matching, e.g. here, for getting a different optimality criterion. This also can be done in O(n*m).

Algorithm to create hex flood puzzle

I'm creating a puzzle game that, while playable by hand for easy levels, is meant to be solved by computer programs for harder ones. The puzzle is a flood fill on a hexagonal board. You can try a prototype here.
(source: hacker.org)
Here is how the puzzle works: by choosing a color from the top, you perform a flood fill starting from the upper-left tile. This progressively converts the board to a solid color. The challenge is to do this in a certain number of moves.
I have created several puzzles similar to this, and the key is to use an algorithm that generates boards that are hard to solve without knowing how they were created. For example, here we might produce a board by reversing the flood fill: working backwards from a solid board until it has been unflooded. We know how many steps this took, and can set this as a lower bound on a solution.
The problem I'm facing is that when I try this approach, my upper-bound is way too high. It becomes trivial to solve the puzzle within this number of moves, even by moving randomly.
An approach that is not a solution is generating a random board and then solving it optimally and setting this as the target. The point is to create a puzzle where solving it optimally is NP time or at least a hard P.
So what I'm looking for is an algorithm that can generate extremely hard boards where solving them, as they get larger, becomes a serious challenge.
When doing RSA encryption, we do not find prime numbers, we select random numbers and then apply tests to them that give us increasingly high probability of the number being prime, withou ever proving it.
I suggest the same. Try to find conditions which that give a good likelyhood of the puzzle having the desired properties, and testing for those. Or you could use genetic algorithms/neural networks and train them to recognize the "good" puzzles, which amounts to the same thing.
I would try to prove that it is NP-complete or in P, to get a feel for the configurations which are difficult.
I'd also abstract away the hexagons and use a representation as a graph.
I've played the rectangular flood puzzle a lot (http://labpixies.com/gadget_page.php?id=10). Excited to see a Hex version! I think finding a hard game is as easy as: avoid large blocks of the same color to appear in the puzzle. At least in the rectangular cases I've seen, nearly all the puzzles that can be solved in a small number of steps have large color blocks.
P.S. I think your "lower bound" is not valid. When working forwards, if a good strategy is used, you could actually finish in fewer steps. The "lower bound" is really an upper bound for the optimum solution.

Resources