For imprecise geoquerying with H3, can we consider pentagons equal to hexagons? - h3

If we use the kRing(H3Index origin, int k, H3Index* out) function to get surrounding hexagons, and we're going to refine the results on the client (i.e. haversine), we never have to worry about pentagons, do we? Because, AFAIK, pentagons have H3 indexes also and are, for all intents and purposes, treated just like hexagons, correct?
It is the hexRange(H3Index origin, int k, H3Index* out) function that cares about pentagons, correct?

For some intents and purposes, yes, pentagons are simply hexagons with one axial dimension removed. They have a normal H3 index representation, will work correctly with most functions that take H3 cells as input, and should be fine in the case you describe. The main considerations are:
They only have 5 neighbors, so any code assuming six neighbors may be incorrect
They return 5 vertices from h3ToGeoBoundary, or 10 vertices for odd resolutions (to account for distortion at the icosahedron edges)
They have approximately 5/6 the area of the smallest hexagon cells.
If you encounter a pentagon when running kRing, you can no longer make any assumptions about the shape of the returned set - e.g. you may have fewer cells in a different arrangement than you would if you only encounter hexagons.
Some functions (hexRange, hexRing, and a few others) will fail with an error code if they encounter a pentagon. hexRange in particular is a fast kRing algorithm, which kRing uses under the hood, falling back to a slow-but-correct algo if a pentagon is encountered. You should only use hexRange if performance is very important and you're willing to handle failures. Functions that do not handle pentagons are generally clearly specified in the docs.

Related

Chess programming: minimax, detecting repeats, transposition tables

I'm building a database of chess evaluations (essentially a map from a chess position to an evaluation), and I want to use this to come up with a good move for given positions. The idea is to do a kind of "static" minimax, i.e.: for each position, use the stored evaluation if evaluations for child nodes (positions after next ply) are not available, otherwise use max (white to move)/min (black to move) evaluations of child nodes (which are determined in the same way).
The problem are, of course, loops in the graph, i.e. repeating positions. I can't fathom how to deal with this without making this infinitely less efficient.
The ideas I have explored so far are:
assume an evaluation of 0 for any position that can be reached in a game with less moves than are currently evaluated. This is an invalid assumption, because - for example - if White plays A, it might not be desirable for Black to follow up with x, but if White plays B, then y -> A -> x -> -B -> -y might be best line, resulting in the same position as A -> x, without any repetitions (-m denoting the inverse move to m here, lower case: Black moves, upper case: White moves).
having one instance for each possible way a position can be reached solves the loop problem, but this yields a bazillion of instances in some positions and is therefore not practical
the fact that there is a loop from a position back to that position doesn't mean that it's a draw by repetition, because playing the repeating line may not be best choice
I've tried iterating through the loops a few times to see if the overall evaluation would become stable. It doesn't, because in some cases, assuming the repeat is the best line means it isn't any longer - and then it goes back to the draw being the back line etc.
I know that chess engines use transposition tables to detect positions already reached before, but I believe this doesn't address my problem, and I actually wonder if there isn't an issue with them: a position may be reachable through two paths in the search tree - one of them going through the same position before, so it's a repeat, and the other path not doing that. Then the evaluation for path 1 would have to be 0, but the one for path 2 wouldn't necessarily be (path 1 may not be the best line), so whichever evaluation the transposition table holds may be wrong, right?
I feel sure this problem must have a "standard / best practice" solution, but google failed me. Any pointers / ideas would be very welcome!
I don't understand what the problem is. A minimax evaluation, unless we've added randomness to it, will have the exact same result for any given board position combined with who's turn it is and other key info. If we have the space available to store common board_position+who's_turn+castling+en passant+draw_related tuples (or hash thereof), go right ahead. When reaching that tuple in any other evaluation, just return the stored value or rely on its more detailed record for more complex evaluations (if the search yielding that record was not exhaustive, we can have different interpretations for it in any one evaluation). If the program also plays chess with time limits on the game, an additional time dimension (maybe a few broad blocks) would probably be needed in the memoisation as well.
(I assume you've read common public info about transposition tables.)

Ripleys K not plotting correctly?

I am doing point pattern analysis using the package spatstat and ran Ripley's K (spatstat::Kest) on my points to see if there is any clustering. However, it appears that not all the lines that should appear in the graph (kFem) have plotted. For example, the red line (Ktrans) stops at around x=12 and the green line (Kbord) doesn't appear at all. I would appreciate any insights as to how to interpret this and if there might be a bug.
Here is my study window. It is an irregular shape because I am analyzing a point pattern along a transect line.
And here is a density plot of my point pattern:
It is unlikely (but not impossible) that there is a simple bug in Kest that causes this, since this particular function has been tested intensively by many users. More likely you have a observation window that is irregular and there is a mathematical reason why the various estimates cannot be calculated at all distances. Please add a plot/summary of your point pattern so we have knowledge of the observation window (or even better give access to the observation window).
Furthermore, to manually inspect the estimates of the K-function you can convert the function value (fv) object to a data.frame and print it:
dat <- as.data.frame(kFem)
head(dat, n = 10)
Update:
Your window is indeed very irregular and the explanation of why it is not producing some corrections at large distances. I guess your transect is only a few metres wide and you are considering distances up to 50m. The border correction can only be calculated for distances up to something like the half width of the transect.
Using Kest implies that you believe that your transect is a subset of a big homogeneous point process (of equal intensity everywhere and with same correlation structure throughout space). If that is true then Kest provides a sensible estimate of the unknown true homogeneous K-function. However, you provide a plot where you have divided the region into sections of high, medium and low intensity which doesn't agree with the assumption of homogeneity. The deviation from the theoretical Poisson line may just be due to inhomogeneous intensity and not actual correlation between points. You should probably only consider distances that are much smaller than 50 (you can set rmax when you call Kest).

Procedural Maze Algorithm With Cells Determined Independently of Neighbors

I was thinking about maze algorithms recently (mostly because I'm working on a game, but I felt this is a more general question than game development related). In simple terms, I was wondering if there is a sort of maze algorithm that can generate (a possibly infinite number of) cells without any information specifically about the cell's neighbors. I imagine, if such a thing were possible, it would rely heavily upon noise functions such as Perlin or Simplex.
Each cell has four walls, these are used when actually rendering the maze so that corridors and walls are not the same thickness.
Let's say, for example, I'd like a cell at (32, 15) to generate its walls.
I know of algorithms like Ellers (which requires a limited number of columns, but infinite rows) and the Virtual fractal Mazes algorithm (which needs to know previous cells in order to build upon them infinitely in both x and y directions).
Does anyone know of any algorithm I could look into for this specific request? If not, are there any algorithms that are good for chunk-based mazes that you know of?
(Note: I did search around for a bit through StackOverflow to see if there were any questions with similar requests to mine, but I did not come across any. If you happen to know of one, a link would be greatly appreciated :D)
Thank you in advance.
Seeeeeecreeeets. My preeeeciooouss secretts. But yeah I can understand the frustration so I'll throw this one to you OP/SO. Feel free to update the PCG Wiki if you're not as lazy as me :3
There are actually many ways to do this. Some of the best techniques for procgen are:
Asking what you really want.
Design backwards. Play in reverse. Result is forwards.
Look at a random sampling of your target goal and try to see overall patterns.
But to get back to the question, there are two simple ways and they both start from asking what your really want. I'll give those first.
The first is to create 2 layers. Both are random noise. You connect the top and the bottom so they're fully connected. No isolated portions. This is asking what you really want which is connected-ness. And then guaranteeing it in a local clean-up step. (tbh I forget the function applied to layer 2 that guarantees connected-ness. I can't find the code atm.... But I remember it was a really simple local function... XOR, Curl, or something similar. Maybe you can figure it out before I fix this).
The second way is using the properties of your functions. As long as your random function is smooth enough you can take the gradient and assign a tile to it. The way you assign the tiles changes the maze structure but you can guarantee connectivity by clever selection of tiles for each gradient (b/c similar or opposite gradients are more likely to be near each other on a smooth gradient function). From here your smooth random can be any form of Perlin Noise, etc. Once again a asking what you want technique.
For backwards-reversed you unfortunately have an NP problem (I'm not sure if it's hard, complete, or whatever it's been a while since I've worked on this...). So while you could generate a random map of distances down a maze path. And then from there generate the actual obstacles... it's not really advisable. There's also a ton of consideration on different cases even for small mazes...
012
123
234
Is simple. There's a column in the lower right corner of 0 and the middle 2 has an _| shaped wall.
042
123
234
This one makes less sense. You still are required to have the same basic walls as before on all the non-changed squares... But you can't have that 4. It needs to be within 1 of at least a single neighbor. (I mean you could have a +3 cost for that square by having something like a conveyor belt or something, but then we're out of the maze problem) Okay so....
032
123
234
Makes more sense but the 2 in the corner is nonsense once again. Flipping that from a trough to a peak would give.
034
123
234
Which makes sense. At any rate. If you can get to this point then looking at local neighbors will give you walls if it's +/-1 then no wall. Otherwise wall. Also note that you can break the rules for the distance map in a consistent way and make a maze just fine. (Like instead of allowing a column picking a wall and throwing it up. This is just loop splitting at this point and should be safe)
For random sampling as the final one that I'm going to look at... Certain maze generation algorithms in the limit take on some interesting properties either as an average configuration or after millions of steps. Some form Voronoi regions. Some form concentric circles with a randomly flipped wall to allow a connection between loops. Etc. The loop one is good example to work off of. Create a set of loops. Flip a random wall on each loop. One will delete a wall which will create access to the next loop. One will split a path and offer a dead-end and a continuation. For a random flip to be a failure there has to be an opening and a split made right next to each other (unless you allow diagonals then we're good). So make loops. Generate random noise per loop. Xor together. Replace local failures with a fixed path if no diagonals are allowed.
So how do we get random noise per loop? Or how do we get better loops than just squares? Just take a random function. Separate divergence and now you have a loop map. If you have the differential equations for the source random function you can pick one random per loop. A simpler way might be to generate concentric circular walls and pick a random point at each radius to flip. Then distort the final result. You have to be careful your distortion doesn't violate any of your path-connected-ness conditions at that point though.

Fast method of tile fitting/arranging

For example, let's say we have a bounded 2D grid which we want to cover with square tiles of equal size. We have unlimited number of tiles that fall into a defined number of types. Each type of tile specifies the letters printed on that tile. Letters are printed next to each edge and only the tiles with matching letters on their adjacent edges can be placed next to one another on the grid. Tiles may be rotated.
Given the size of the grid and tile type definitions, what is the fastest method of arranging the tiles such that the above constraint is met and the entire/majority of the grid is covered? Note that my use case is for large grids (~20 in each dimension) and medium-large number of solutions (unlike Eternity II).
So far, I've tried DFS starting in the center and picking the locations around filled area that allow the least number of possibilities and backtrack in case no progress can be made. This only works for simple scenarios with one or two types. Any more and too much backtracking ensues.
Here's a trivial example, showing input and the final output:
This is a hard puzzle.
The Eternity 2 was a puzzle of this form with a 16 by 16 square grid.
Despite a 2 million dollar prize, no one found the solution in several years.
The paper "Jigsaw Puzzles, Edge Matching, and Polyomino Packing: Connections and Complexity" by Erik D. Demaine, Martin L. Demaine shows that this type of puzzle is NP-complete.
Given a problem of this sort with a square grid I would try a brute force list of all possible columns, and then a dynamic programming solution across all of the rows. This will find the number of solutions, and with a little extra work can be used to generate all of the solutions.
However if your rows are n long and there are m letters with k tiles, then the brute force list of all possible columns has up to mn possible right edges with up to m4k combinations/rotations of tiles needed to generate it. Then the transition from the right edge of one column to the right edge of the next next potentially has up to m2n possibilities in it. Those numbers are usually not worst case scenarios, but the size of those data structures will be the upper bound on the feasibility of that technique.
Of course if those data structures get too large to be feasible, then the recursive algorithm that you are describing will be too slow to enumerate all solutions. But if there are enough, it still might run acceptably fast even if this approach is infeasible.
Of course if you have more rows than columns, you'll want to reverse the role of rows and columns in this technique.

Simple k-nearest-neighbor algorithm for euclidean data with variable density?

An elaboration on this question, but with more constraints.
The idea is the same, to find a simple, fast algorithm for k-nearest-neighbors in 2 euclidean dimensions. The bucketing grid seems to work nicely if you can find a grid size that will suitably partition your data. However, what if the data is not uniformly distributed, but has areas with both very high and very low density (for example, the US population), so that no fixed grid size could guarantee both enough neighbors and efficiency? Can this method still be salvaged?
If not, other suggestions would be helpful, though I hope for answers less complex than moving to kd-trees, etc.
If you don't have too many elements, just compare each with all the others. This can be a lot faster than you'd think; today's machines are fast. Unfortunately, the square factor will catch you sooner or later; I figure a linear search of a million objects won't take tooo long, so you may be okay with up to 1000 elements. Using a grid, or even stripes, might boost that number substantially.
But I think you're stuck with a quadtree (a specific form of k-d tree). Your whole map is one block, which can contain four subblocks (upper left, upper right, lower left, lower right). When a block fills up with more elements than you want to do a linear search on, break it into smaller ones and transfer the elements. (Only leaf nodes have elements.) It's easy to search within a given radius of a given point. Start at the top and if a part of a block is within range of the point, check out it's subblocks the same way if it has them. If it doesn't, check its elements.
(When searching for "closest", take care. The square grid means a nearer object might be in a farther block. You have to get everything within a given radius, then check 'em all. If you want the 10 closest and your radius of 20 only picked up 5, you need to try a larger radius. You may have a rejected item that proved to be 30 away and think you should grab it and a few others to make up your 10. However, there may be a few items at 25 away whose whole blocks were rejected, and you want them instead. There ought to be a better solution for this, but I haven't figured it out yet. I just make a guess at the radius and double it till I get enough.)
Quadtrees are fun. If you can set up your data and then access it, it's easy. The problems come when your mapped elements appear, disappear, and move while you are trying to figure out who's near what.
Have you looked at this?
http://www.cs.sunysb.edu/~algorith/major_section/1.4.shtml
kd-trees are quite simple to implement, there are standard java/c implementations.
Also:
You may want to post your question here:
https://cstheory.stackexchange.com/?as=1

Resources