Does initial direction matters building a KdTree - algorithm

For example all samples are very close to x axis, split first in which direction seems not affect anything, there will be a split in the other direction anyway, so after 2 iteration in NNS I think we should expect the information filtered almost the same, is that correct?

Related

Chess programming: minimax, detecting repeats, transposition tables

I'm building a database of chess evaluations (essentially a map from a chess position to an evaluation), and I want to use this to come up with a good move for given positions. The idea is to do a kind of "static" minimax, i.e.: for each position, use the stored evaluation if evaluations for child nodes (positions after next ply) are not available, otherwise use max (white to move)/min (black to move) evaluations of child nodes (which are determined in the same way).
The problem are, of course, loops in the graph, i.e. repeating positions. I can't fathom how to deal with this without making this infinitely less efficient.
The ideas I have explored so far are:
assume an evaluation of 0 for any position that can be reached in a game with less moves than are currently evaluated. This is an invalid assumption, because - for example - if White plays A, it might not be desirable for Black to follow up with x, but if White plays B, then y -> A -> x -> -B -> -y might be best line, resulting in the same position as A -> x, without any repetitions (-m denoting the inverse move to m here, lower case: Black moves, upper case: White moves).
having one instance for each possible way a position can be reached solves the loop problem, but this yields a bazillion of instances in some positions and is therefore not practical
the fact that there is a loop from a position back to that position doesn't mean that it's a draw by repetition, because playing the repeating line may not be best choice
I've tried iterating through the loops a few times to see if the overall evaluation would become stable. It doesn't, because in some cases, assuming the repeat is the best line means it isn't any longer - and then it goes back to the draw being the back line etc.
I know that chess engines use transposition tables to detect positions already reached before, but I believe this doesn't address my problem, and I actually wonder if there isn't an issue with them: a position may be reachable through two paths in the search tree - one of them going through the same position before, so it's a repeat, and the other path not doing that. Then the evaluation for path 1 would have to be 0, but the one for path 2 wouldn't necessarily be (path 1 may not be the best line), so whichever evaluation the transposition table holds may be wrong, right?
I feel sure this problem must have a "standard / best practice" solution, but google failed me. Any pointers / ideas would be very welcome!
I don't understand what the problem is. A minimax evaluation, unless we've added randomness to it, will have the exact same result for any given board position combined with who's turn it is and other key info. If we have the space available to store common board_position+who's_turn+castling+en passant+draw_related tuples (or hash thereof), go right ahead. When reaching that tuple in any other evaluation, just return the stored value or rely on its more detailed record for more complex evaluations (if the search yielding that record was not exhaustive, we can have different interpretations for it in any one evaluation). If the program also plays chess with time limits on the game, an additional time dimension (maybe a few broad blocks) would probably be needed in the memoisation as well.
(I assume you've read common public info about transposition tables.)

Procedural Maze Algorithm With Cells Determined Independently of Neighbors

I was thinking about maze algorithms recently (mostly because I'm working on a game, but I felt this is a more general question than game development related). In simple terms, I was wondering if there is a sort of maze algorithm that can generate (a possibly infinite number of) cells without any information specifically about the cell's neighbors. I imagine, if such a thing were possible, it would rely heavily upon noise functions such as Perlin or Simplex.
Each cell has four walls, these are used when actually rendering the maze so that corridors and walls are not the same thickness.
Let's say, for example, I'd like a cell at (32, 15) to generate its walls.
I know of algorithms like Ellers (which requires a limited number of columns, but infinite rows) and the Virtual fractal Mazes algorithm (which needs to know previous cells in order to build upon them infinitely in both x and y directions).
Does anyone know of any algorithm I could look into for this specific request? If not, are there any algorithms that are good for chunk-based mazes that you know of?
(Note: I did search around for a bit through StackOverflow to see if there were any questions with similar requests to mine, but I did not come across any. If you happen to know of one, a link would be greatly appreciated :D)
Thank you in advance.
Seeeeeecreeeets. My preeeeciooouss secretts. But yeah I can understand the frustration so I'll throw this one to you OP/SO. Feel free to update the PCG Wiki if you're not as lazy as me :3
There are actually many ways to do this. Some of the best techniques for procgen are:
Asking what you really want.
Design backwards. Play in reverse. Result is forwards.
Look at a random sampling of your target goal and try to see overall patterns.
But to get back to the question, there are two simple ways and they both start from asking what your really want. I'll give those first.
The first is to create 2 layers. Both are random noise. You connect the top and the bottom so they're fully connected. No isolated portions. This is asking what you really want which is connected-ness. And then guaranteeing it in a local clean-up step. (tbh I forget the function applied to layer 2 that guarantees connected-ness. I can't find the code atm.... But I remember it was a really simple local function... XOR, Curl, or something similar. Maybe you can figure it out before I fix this).
The second way is using the properties of your functions. As long as your random function is smooth enough you can take the gradient and assign a tile to it. The way you assign the tiles changes the maze structure but you can guarantee connectivity by clever selection of tiles for each gradient (b/c similar or opposite gradients are more likely to be near each other on a smooth gradient function). From here your smooth random can be any form of Perlin Noise, etc. Once again a asking what you want technique.
For backwards-reversed you unfortunately have an NP problem (I'm not sure if it's hard, complete, or whatever it's been a while since I've worked on this...). So while you could generate a random map of distances down a maze path. And then from there generate the actual obstacles... it's not really advisable. There's also a ton of consideration on different cases even for small mazes...
012
123
234
Is simple. There's a column in the lower right corner of 0 and the middle 2 has an _| shaped wall.
042
123
234
This one makes less sense. You still are required to have the same basic walls as before on all the non-changed squares... But you can't have that 4. It needs to be within 1 of at least a single neighbor. (I mean you could have a +3 cost for that square by having something like a conveyor belt or something, but then we're out of the maze problem) Okay so....
032
123
234
Makes more sense but the 2 in the corner is nonsense once again. Flipping that from a trough to a peak would give.
034
123
234
Which makes sense. At any rate. If you can get to this point then looking at local neighbors will give you walls if it's +/-1 then no wall. Otherwise wall. Also note that you can break the rules for the distance map in a consistent way and make a maze just fine. (Like instead of allowing a column picking a wall and throwing it up. This is just loop splitting at this point and should be safe)
For random sampling as the final one that I'm going to look at... Certain maze generation algorithms in the limit take on some interesting properties either as an average configuration or after millions of steps. Some form Voronoi regions. Some form concentric circles with a randomly flipped wall to allow a connection between loops. Etc. The loop one is good example to work off of. Create a set of loops. Flip a random wall on each loop. One will delete a wall which will create access to the next loop. One will split a path and offer a dead-end and a continuation. For a random flip to be a failure there has to be an opening and a split made right next to each other (unless you allow diagonals then we're good). So make loops. Generate random noise per loop. Xor together. Replace local failures with a fixed path if no diagonals are allowed.
So how do we get random noise per loop? Or how do we get better loops than just squares? Just take a random function. Separate divergence and now you have a loop map. If you have the differential equations for the source random function you can pick one random per loop. A simpler way might be to generate concentric circular walls and pick a random point at each radius to flip. Then distort the final result. You have to be careful your distortion doesn't violate any of your path-connected-ness conditions at that point though.

Shuffle polyominos in a grid

I have a 2d array that contains polyominos of lengths 1 to 4, i want to shuffle their position AND rotation.
For example:
[0,0,1,0,0,0,0,0,0,0]
[0,0,1,0,0,0,0,0,0,0]
[0,0,1,1,0,0,0,0,0,0]
[0,0,0,0,0,0,0,0,0,0]
[0,0,X,0,0,0,0,3,0,0]
[0,0,0,0,0,0,0,0,0,0]
[0,0,0,2,2,0,0,0,0,0]
[0,0,0,2,0,0,0,X,0,0]
[0,0,0,0,0,0,0,0,0,0]
[0,0,0,0,0,0,0,0,0,0]
First, 0s are empty spaces, and numbers are polyominos. There are four of them: one tetromino "L" made by 1s, one tromino "L" made of 2s, and one monomino made by one 3. I want to shuffle this grid, changing their position and rotation without loosing their structure.
Second, though 0s are available spaces, Xs aren't and cant be shuffled.
I'd be happy if someone can help me solve the first part of the problem, but if there is anyone smart (or bored) enough to solve also the second part, it'll really appreciated because this stuff gets way over my head.
Edit:
The problem can also be seen as placing N polyominos in a 2d grid of fixed size.
I think that rather than having a board the way you have it, you should actually store a list of pieces, and their rotation and position on your board. The way you are storing it right now, it is much more difficult to manipulate because you have essentially finalized the data onto the board as a single data set.
If you like, you can store the board with a set of booleans (simply to take up less space in memory) where a true correlates to your "X" and a false correlates to an un-utilized space.
Then I would randomly choose a rotation for each piece before placement.
After you know the finalized randomized rotation, you can move through your LIST of pieces, generate a list of all placement locations possible, and simply choose one of those locations randomly. You will have to generate this potential placement list separately for each piece, as once a piece is placed, you will not be able to place another piece on top of it!
The trick here is to separate your board data from your pieces data so that you can manipulate the pieces more easily.

Simple k-nearest-neighbor algorithm for euclidean data with variable density?

An elaboration on this question, but with more constraints.
The idea is the same, to find a simple, fast algorithm for k-nearest-neighbors in 2 euclidean dimensions. The bucketing grid seems to work nicely if you can find a grid size that will suitably partition your data. However, what if the data is not uniformly distributed, but has areas with both very high and very low density (for example, the US population), so that no fixed grid size could guarantee both enough neighbors and efficiency? Can this method still be salvaged?
If not, other suggestions would be helpful, though I hope for answers less complex than moving to kd-trees, etc.
If you don't have too many elements, just compare each with all the others. This can be a lot faster than you'd think; today's machines are fast. Unfortunately, the square factor will catch you sooner or later; I figure a linear search of a million objects won't take tooo long, so you may be okay with up to 1000 elements. Using a grid, or even stripes, might boost that number substantially.
But I think you're stuck with a quadtree (a specific form of k-d tree). Your whole map is one block, which can contain four subblocks (upper left, upper right, lower left, lower right). When a block fills up with more elements than you want to do a linear search on, break it into smaller ones and transfer the elements. (Only leaf nodes have elements.) It's easy to search within a given radius of a given point. Start at the top and if a part of a block is within range of the point, check out it's subblocks the same way if it has them. If it doesn't, check its elements.
(When searching for "closest", take care. The square grid means a nearer object might be in a farther block. You have to get everything within a given radius, then check 'em all. If you want the 10 closest and your radius of 20 only picked up 5, you need to try a larger radius. You may have a rejected item that proved to be 30 away and think you should grab it and a few others to make up your 10. However, there may be a few items at 25 away whose whole blocks were rejected, and you want them instead. There ought to be a better solution for this, but I haven't figured it out yet. I just make a guess at the radius and double it till I get enough.)
Quadtrees are fun. If you can set up your data and then access it, it's easy. The problems come when your mapped elements appear, disappear, and move while you are trying to figure out who's near what.
Have you looked at this?
http://www.cs.sunysb.edu/~algorith/major_section/1.4.shtml
kd-trees are quite simple to implement, there are standard java/c implementations.
Also:
You may want to post your question here:
https://cstheory.stackexchange.com/?as=1

Is there an efficient algorithm to generate random points in general position in the plane?

I need to generate n random points in general position in the plane, i.e. no three points can lie on a same line. Points should have coordinates that are integers and lie inside a fixed square m x m. What would be the best algorithm to solve such a problem?
Update: square is aligned with the axes.
Since they're integers within a square, treat them as points in a bitmap. When you add a point after the first, use Bresenham's algorithm to paint all pixels on each of the lines going through the new point and one of the old ones. When you need to add a new point, get a random location and check if it's clear; otherwise, try again. Since each pair of pixels gives a new line, and thus excludes up to m-2 other pixels, as the number of points grows you will have several random choices rejected before you find a good one. The advantage of the approach I'm suggesting is that you only pay the cost of going through all lines when you have a good choice, while rejecting a bad one is a very quick test.
(if you want to use a different definition of line, just replace Bresenham's with the appropriate algorithm)
Can't see any way around checking each point as you add it, either by (a) running through all of the possible lines it could be on, or (b) eliminating conflicting points as you go along to reduce the possible locations for the next point. Of the two, (b) seems like it could give you better performance.
Similar to #LaC's answer. If memory is not a problem, you could do it like this:
Add all points on the plane to a list (L).
Shuffle the list.
For each point (P) in the list,
For each point (Q) previously picked,
Remove every point from L which are linear to P-Q.
Add P to the picked list.
You could continue the outer loop until you have enough points, or run out of them.
This might just work (though might be a little constrained on being random). Find the largest circle you can draw within the square (this seems very doable). Pick any n points on the circle, no three will ever be collinear :-).
This should be an easy enough task in code. Say the circle is centered at origin (so something of the form x^2 + y^2 = r^2). Assuming r is fixed and x randomly generated, you can solve to find y coordinates. This gives you two points on the circle for every x which are diametrically opposite. Hope this helps.
Edit: Oh, integer points, just noticed that. Thats a pity. I'm going to keep this solution up though - since I like the idea
Both #LaC's and #MizardX's solution are very interesting, but you can combine them to get even better solution.
The problem with #LaC's solution is that you get random choices rejected. The more points you have already generated the harder it gets to generate new ones. If there is only one available position left you have slight chance of randomly choosing it (1/(n*m)).
In the #MizardX's solution you never get rejected choices, however if you directly implement the "Remove every point from L which are linear to P-Q." step you'll get worse complexity (O(n^5)).
Instead it would be better to use a bitmap to find which points from L are to be removed. The bitmap would contain a value indicating whether a point is free to use and what is its location on the L list or a value indicating that this point is already crossed out. This way you get worst-case complexity of O(n^4) which is probably optimal.
EDIT:
I've just found that question: Generate Non-Degenerate Point Set in 2D - C++
It's very similar to this one. It would be good to use solution from this answer Generate Non-Degenerate Point Set in 2D - C++. Modifying it a bit to use radix or bucket sort and adding all the n^2 possible points to the P set initially and shufflying it, one can also get worst-case complexity of O(n^4) with a much simpler code. Moreover, if space is a problem and #LaC's solution is not feasible due to space requirements, then this algorithm will just fit in without modifications and offer a decent complexity.
Here is a paper that can maybe solve your problem:
"POINT-SETS IN GENERAL POSITION WITH MANY
SIMILAR COPIES OF A PATTERN"
by BERNARDO M. ABREGO AND SILVIA FERNANDEZ-MERCHANT
um, you don't specify which plane.. but just generate 3 random numbers and assign to x,y, and z
if 'the plane' is arbitrary, then set z=o every time or something...
do a check on x and y to see if they are in your m boundary,
compare the third x,y pair to see if it is on the same line as the first two... if it is, then regenerate the random values.

Resources