Most efficient way to get a tree object's root node - performance

I am interested in some design pattern that will give me very fast access from any tree node to the tree root with as little CPU and memory overhead as possible.
Now, to iterate the obvious solutions, and explain why they are inapplicable:
go down the tree until the root is found - which is what I am currently doing, but as the tree can get quite deep and root access is pretty frequent, that is not an ideal solution, even if mitigating the overhead by caching and reusing the root as much as possible
store the root node pointer in every node - very fast, but adds a significant memory overhead, as the tree is quite big
Now, if there was only one tree, I could use a singleton, a static class member or any similar approach, however the application has more than one tree, each representing an open project.
Before resorting to searching the root, my design involved only one single active project at a time, so I could presume that if any user activity is occurring, it would be targeting the active project, which was a very efficient way to get the particular object's root item.
But now I have a multiple windows application, where each window can have a different active project, and also inter-project communication that can trigger activity inside a project that is not currently active in any editor window. So it is no longer possible to use the "active project pointer" approach.
It seems that I have exhausted all options, but there might be some fancy design pattern which could give me a good compromise?

I went for a solution based on segregating the purpose of different tree nodes.
Most nodes are simply content, but there are also container nodes, which are fairly evenly distributed across the tree.
Container nodes store the tree root, and since they are few and far in between, the memory overhead is minimized. So is the lookup time, as I no longer have to traverse the entire tree until I get to the root, but only down to the nearest container node.
So in a typical usage scenario, this reduces lookup time by 2-3 orders of magnitude, by adding about 1-2% of memory overhead. A more than fair compromise.

Related

How is Monte Carlo Tree Search implemented in practice

I understand, to a certain degree, how the algorithm works. What I don't fully understand is how the algorithm is actually implemented in practice.
I'm interested in understanding what optimal approaches would be for a fairly complex game (maybe chess). i.e. recursive approach? async? concurrent? parallel? distributed? data structures and/or database(s)?
-- What type of limits would we expect to see on a single machine? (could we run concurrently across many cores... gpu maybe?)
-- If each branch results in a completely new game being played, (this could reach the millions) how do we keep the overall system stable? & how can we reuse branches already played?
recursive approach? async? concurrent? parallel? distributed? data structures and/or database(s)
In MCTS, there's not much of a point in a recursive implementation (which is common in other tree search algorithms like the minimax-based ones), because you always go "through" a game in sequences from current game state (root node) till game states you choose to evaluate (terminal game states, unless you choose to go with a non-standard implementation using a depth limit on the play-out phase and a heuristic evaluation function). The much more obvious implementation using while loops is just fine.
If it's your first time implementing the algorithm, I'd recommend just going for a single-threaded implementation first. It is a relatively easy algorithm to parallelize though, there are multiple papers on that. You can simply run multiple simulations (where simulation = selection + expansion + playout + backpropagation) in parallel. You can try to make sure everything gets updated cleanly during backpropagation, but you can also simply decide to not use any locks / blocking etc. at all, there's already enough randomness in all the simulations anyway so if you lose information from a couple of simulations here and there due to naively-implemented parallelization it really doesn't hurt too much.
As for data structures, unlike algorithms like minimax, you actually do need to explicitly build a tree and store it in memory (it is built up gradually as the algorithm is running). So, you'll want a general tree data structure with Nodes that have a list of successor / child Nodes, and also a pointer back to the parent Node (required for backpropagation of simulation outcomes).
What type of limits would we expect to see on a single machine? (could we run concurrently across many cores... gpu maybe?)
Running across many cores can be done yes (see point about parallelization above). I don't see any part of the algorithm being particularly well-suited for GPU implementations (there are no large matrix multiplications or anything like that), so GPU is unlikely to be interesting.
If each branch results in a completely new game being played, (this could reach the millions) how do we keep the overall system stable? & how can we reuse branches already played?
In the most commonly-described implementation, the algorithm creates only one new node to store in memory per iteration/simulation in the expansion phase (the first node encountered after the Selection phase). All other game states generated in the play-out phase of the same simulation do not get any nodes to store in memory at all. This keeps memory usage in check, it means your tree only grows relatively slowly (at a rate of 1 node per simulation). It does mean you get slightly less re-usage of previously-simulated branches, because you don't store everything you see in memory. You can choose to implement a different strategy for the expansion phase (for example, create new nodes for all game states generated in the play-out phase). You'll have to carefully monitor memory usage if you do this though.

Suitable tree data structure

Which is the most suitable tree data structure to model a hierarchical (containment relationship) content. My language is bit informal as I don't have much theoretical background on these
Parent node can have multiple children.
Unique parent
Tree structure is rarely changed, os it is ok to recreate than add/rearrange nodes.
Two way traversal
mainly interested in, find parent, find children, find a node with a unique id
Every node has a unique id
There might be only hundreds of nodes in total, so performance may not be big influence
Persistence may be good to have, but not necessary as I plan to use it in memory after reading the data from DB.
My language of choice is go (golang), so available libraries are limited. Please give a recommendation without considering the language which best fit the above requirement.
http://godashboard.appspot.com/ lists some of the available tree libraries. Not sure about the quality and how active they are. I read god about
https://github.com/petar/GoLLRB
http://www.stathat.com/src/treap
Please let know any additional information required.
Since there are only hundreds of nodes, just make the structure the same as you have described.
Each node has a unique reference to parent node.
Each node has a list of child node.
Each node has an id
There is a (external) map from id --> node. May not even necessary.
2 way traversal is possible, since parent and child nodes are know. Same for find parent and find child.
Find id can be done by traverse the whole tree, if no map. Or you can use the map to quickly find the node.
Add node is easy, since there is a list in each node. Rearrange is also easy, since you can just freely add/remove from list of child nodes and reassign the parent node.
I'm answering this question from a language-agnostic aspect. This is a tree structure without any limit, so implementation is not that popular.
I think B-Tree is way to go for your requirements. http://en.wikipedia.org/wiki/B-tree
Point 1,2,3: B-Tree inherently support this. (multiple children, unique parent, allows insertion/deletion of elements
Point 4,5: each node will have pointers for its child by default implementation . Additionally you can maintain pointer of parent for each node. you can implement your search/travers operations with BFS/DFS with help of these pointers
Pomit 6: depends on implementation of your insert method if you dont allow duplicate records
Pont 7,8: not a issue as for you have mentioned that you have only hundreds of records. Though B-Trees are quite good data structure for external disk storage also.

What are pro/cons of push/pull data flow models?

I've been developing an in-house DSP application (Java w/ hooks for Groovy/Jython/JRuby, plugins via OSGi, plenty of JNI to go around) in data flow/diagram style, similar to pure data and simulink. My current design is a push model. The user interacts with some source component causing it to push data onto the next component and so on until a end block (typically a display or file writer). There are some unique challenges with this design specifically when a component starves for input. There is no easy way to request more input. I have mitiated some of this with feedback control flow, ex an FFT block can broadcast that it needs more data to source block of it's chain. I've contemplated adding support for components to be either push/pull/both.
I'm looking for responses regarding the merits of push vs pull vs both/hybrid. Have you done this before? What are some of the "gotchas"? How did you handle them? Is there a better solution for this problem?
Some experience with a "mostly-pull" approach in a large-scale product:
Model: Nodes build a 1:N tree, i.e. each component (except the root) has 1 parent and 1..N children. Data flows almost exclusively from parent to children. Change notifications can originate from any node in the tree.
Implementation: All leafs are notified with the sending node's id and a "generation" counter. Leafs know which node path they depend on, so they know if they need to update. (Any other child node update algorithm would do, too, and might have been better in hindsight).
Leafs query their parent for current data, query bubbles up recursively. The generation counter is included, so the bubble-up stops at the originating node.
Advantages:
parent nodes don't need much/any information about their children. Data can be consumed by anyone - this allowed a generic approach to implementing some (initially not expected) non-UI functionality on top of the data intended for display
Child nodes can aggregate and delay updates (avoiding repaints sure beats fast painting)
inactive leafs do cause no data traffic at all
Disadvantages:
Incremental updates are expensive, as full data is published.
The implementation actually allows for different data packets to be requested (and
the generation counter could prevent unecessary data traffic), but the data packets initially designed are very large. Slicing them was an afterthought, but works ok.
You need a real good generation mechanism. The one initially implemented collided with initial updates (that need special handling - see "incremental updates") and aggregation of updates
the need for data travelling up the tree was greatly underestimated.
publish is cheap only when the node offers read-only access to current data. This might require additional update synchronization, though
sometimes you want intermediate nodes to update, even when all leafs are inactive
some leafs ended up implementing polling, some base nodes ended up relying on that. ugly.
Generally:
Data-Pull "feels" more native to me when data and processing layer should know nothing about the UI. However, it requires a complex change notificatin mechanism to avoid "Updating the universe".
Data-Push simplifies incremental updates, but only if the sender intimately knows the receiver.
I have no experience of similar scale using other models, so I can't really make a recommendation. Looking back, I see that I've mostly used pull, which was less of a hassle. It would be interesting to see other peoples experiences.
I work on a pure-pull image processing library. It's more geared to batch-style operations where we don't have to deal with dynamic inputs and for that it seems to work very well. Pull works especially well for large data sets and for threading: we scale linearly to at least 32 CPUs (depending on the graph being evaluated, of course, heh).
We have a GUI that allows leaves to be dynamic data sources (for example, a video camera delivering frames) and they are handled by throwing away and rebuilding the relevant parts of the graph on a change. This is cheap in our case, so the overhead isn't that high.

How to subdivide a 2d game world for better collision detection

I'm developing a game which features a sizeable square 2d playing area. The gaming area is tileless with bounded sides (no wrapping around). I am trying to figure out how I can best divide up this world to increase the performance of collision detection. Rather than checking each entity for collision with all other entities I want to only check nearby entities for collision and obstacle avoidance.
I have a few special concerns for this game world...
I want to be able to be able to use a large number of entities in the game world at once. However, a % of entities won't collide with entities of the same type. For example projectiles won't collide with other projectiles.
I want to be able to use a large range of entity sizes. I want there to be a very large size difference between the smallest entities and the largest.
There are very few static or non-moving entities in the game world.
I'm interested in using something similar to what's described in the answer here: Quadtree vs Red-Black tree for a game in C++?
My concern is how well will a tree subdivision of the world be able to handle large size differences in entities? To divide the world up enough for the smaller entities the larger ones will need to occupy a large number of regions and I'm concerned about how that will affect the performance of the system.
My other major concern is how to properly keep the list of occupied areas up to date. Since there's a lot of moving entities, and some very large ones, it seems like dividing the world up will create a significant amount of overhead for keeping track of which entities occupy which regions.
I'm mostly looking for any good algorithms or ideas that will help reduce the number collision detection and obstacle avoidance calculations.
If I were you I'd start off by implementing a simple BSP (binary space partition) tree. Since you are working in 2D, bound box checks are really fast. You basically need three classes: CBspTree, CBspNode and CBspCut (not really needed)
CBspTree has one root node instance of class CBspNode
CBspNode has an instance of CBspCut
CBspCut symbolize how you cut a set in two disjoint sets. This can neatly be solved by introducing polymorphism (e.g. CBspCutX or CBspCutY or some other cutting line). CBspCut also has two CBspNode
The interface towards the divided world will be through the tree class and it can be a really good idea to create one more layer on top of that, in case you would like to replace the BSP solution with e.g. a quad tree. Once you're getting the hang of it. But in my experience, a BSP will do just fine.
There are different strategies of how to store your items in the tree. What I mean by that is that you can choose to have e.g. some kind of container in each node that contains references to the objects occuping that area. This means though (as you are asking yourself) that large items will occupy many leaves, i.e. there will be many references to large objects and very small items will show up at single leaves.
In my experience this doesn't have that large impact. Of course it matters, but you'd have to do some testing to check if it's really an issue or not. You would be able to get around this by simply leaving those items at branched nodes in the tree, i.e. you will not store them on "leaf level". This means you will find those objects quick while traversing down the tree.
When it comes to your first question. If you only are going to use this subdivision for collision testing and nothing else, I suggest that things that can never collide never are inserted into the tree. A missile for example as you say, can't collide with another missile. Which would mean that you dont even have to store the missile in the tree.
However, you might want to use the bsp for other things as well, you didn't specify that but keep that in mind (for picking objects with e.g. the mouse). Otherwise I propose that you store everything in the bsp, and resolve the collision later on. Just ask the bsp of a list of objects in a certain area to get a limited set of possible collision candidates and perform the check after that (assuming objects know what they can collide with, or some other external mechanism).
If you want to speed up things, you also need to take care of merge and split, i.e. when things are removed from the tree, a lot of nodes will become empty or the number of items below some node level will decrease below some merge threshold. Then you want to merge two subtrees into one node containing all items. Splitting happens when you insert items into the world. So when the number of items exceed some splitting threshold you introduce a new cut, which splits the world in two. These merge and split thresholds should be two constants that you can use to tune the efficiency of the tree.
Merge and split are mainly used to keep the tree balanced and to make sure that it works as efficient as it can according to its specifications. This is really what you need to worry about. Moving things from one location and thus updating the tree is imo fast. But when it comes to merging and splitting it might become expensive if you do it too often.
This can be avoided by introducing some kind of lazy merge and split system, i.e. you have some kind of dirty flagging or modify count. Batch up all operations that can be batched, i.e. moving 10 objects and inserting 5 might be one batch. Once that batch of operations is finished, you check if the tree is dirty and then you do the needed merge and/or split operations.
Post some comments if you want me to explain further.
Cheers !
Edit
There are many things that can be optimized in the tree. But as you know, premature optimization is the root to all evil. So start off simple. For example, you might create some generic callback system that you can use while traversing the tree. This way you dont have to query the tree to get a list of objects that matched the bound box "question", instead you can just traverse down the tree and execute that call back each time you hit something. "If this bound box I'm providing intersects you, then execute this callback with these parameters"
You most definitely want to check this list of collision detection resources from gamedev.net out. It's full of resources with game development conventions.
For other than collision detection only, check their entire list of articles and resources.
My concern is how well will a tree
subdivision of the world be able to
handle large size differences in
entities? To divide the world up
enough for the smaller entities the
larger ones will need to occupy a
large number of regions and I'm
concerned about how that will affect
the performance of the system.
Use a quad tree. For objects that exist in multiple areas you have a few options:
Store the object in both branches, all the way down. Everything ends up in leaf nodes but you may end up with a significant number of extra pointers. May be appropriate for static things.
Split the object on the zone border and insert each part in their respective locations. Creates a lot of pain and isn't well defined for a lot of objects.
Store the object at the lowest point in the tree you can. Sets of objects now exist in leaf and non-leaf nodes, but each object has one pointer to it in the tree. Probably best for objects that are going to move.
By the way, the reason you're using a quad tree is because it's really really easy to work with. You don't have any heuristic based creation like you might with some BSP implementations. It's simple and it gets the job done.
My other major concern is how to
properly keep the list of occupied
areas up to date. Since there's a lot
of moving entities, and some very
large ones, it seems like dividing the
world up will create a significant
amount of overhead for keeping track
of which entities occupy which
regions.
There will be overhead to keeping your entities in the correct spots in the tree every time they move, yes, and it can be significant. But the whole point is that you're doing much much less work in your collision code. Even though you're adding some overhead with the tree traversal and update it should be much smaller than the overhead you just removed by using the tree at all.
Obviously depending on the number of objects, size of game world, etc etc the trade off might not be worth it. Usually it turns out to be a win, but it's hard to know without doing it.
There are lots of approaches. I'd recommend settings some specific goals (e.g., x collision tests per second with a ratio of y between smallest to largest entities), and do some prototyping to find the simplest approach that achieves those goals. You might be surprised how little work you have to do to get what you need. (Or it might be a ton of work, depending on your particulars.)
Many acceleration structures (e.g., a good BSP) can take a while to set up and thus are generally inappropriate for rapid animation.
There's a lot of literature out there on this topic, so spend some time searching and researching to come up with a list candidate approaches. Mock them up and profile.
I'd be tempted just to overlay a coarse grid over the play area to form a 2D hash. If the grid is at least the size of the largest entity then you only ever have 9 grid squares to check for collisions and it's a lot simpler than managing quad-trees or arbitrary BSP trees. The overhead of determining which coarse grid square you're in is typically just 2 arithmetic operations and when a change is detected the grid just has to remove one reference/ID/pointer from one square's list and add the same to another square.
Further gains can be had from keeping the projectiles out of the grid/tree/etc lookup system - since you can quickly determine where the projectile would be in the grid, you know which grid squares to query for potential collidees. If you check collisions against the environment for each projectile in turn, there's no need for the other entities to then check for collisions against the projectiles in reverse.

Store hierarchies in a way that is resistant to corruption

I was thinking today about the best way to store a hierarchical set of nodes, e.g.
(source: www2002.org)
The most obvious way to represent this (to me at least) would be for each node to have nextSibling and childNode pointers, either of which could be null.
This has the following properties:
Requires a small number of changes if you want to add in or remove a node somewhere
Is highly susceptible to corruption. If one node was lost, you could potentially lose a large amount of other nodes that were dependent on being found through that node's pointers.
Another method you might use is to come up with a system of coordinates, e.g. 1.1, 1.2, 1.2.1, 1.2.2. 1.2.3 would be the 3rd node at the 3rd level, with the 2nd node at the prior level as its parent. Unanticipated loss of a node would not affect the ability to resolve any other nodes. However, adding in a node somewhere has the potential effect of changing the coordinates for a large number of other nodes.
What are ways that you could store a hierarchy of nodes that requires few changes to add or delete a node and is resilient to corruption of a few nodes? (not implementation-specific)
When you refer to corruption, are you talking about RAM or some other storage? Perhaps during transmission over some medium?
In any case, when you are dealing with data corruption you are talking about an entire field of computer science that deals with error detection and correction.
When you talk about losing a node, the first thing you have to figure out is 'how do I know I've lost a node?', that's error detection.
As for the problem of protecting data from corruption, pretty much the only way to do this is with redundancy. The amount of redundancy is determined by what limit you want to put on the degree of corruption you would like to be able to recover from. You can't really protect yourself from this with a clever structure design as you are just as likely to suffer corruption to the critical 'clever' part of your structure :)
The ever-wise wikipedia is a good place to start: Error detection and correction
I was thinking today about the best way to store a hierarchical set of nodes
So you are writing a filesystem? ;-)
The most obvious way to represent this (to me at least) would be for each node to have nextSibling and childNode pointers
Why? The sibling information is present at the parent node, so all you need is a pointer back to the parent. A doubly linked-list, so to speak.
What are ways that you could store a hierarchy of nodes that requires few changes to add or delete a node and is resilient to corruption of a few nodes?
There are actually two different questions involved here.
Is the data corrupt?
How do I fix corrupt data (aka self healing systems)?
Answers to these two questions will determine the exact nature of the solution.
Data Corruption
If your only aim is to know if your data is good or not, store a hash digest of child node information with the parent.
Self Healing Structures
Any self healing structure will need the following information:
Is there a corruption? (See above)
Where is the corruption?
Can it be healed?
Different algorithms exist to fix data with varying degree of effectiveness. The root idea is to introduce redundancy. Reconstruction depends on your degree of redundancy. Since the best guarantees are given by the most redundant systems -- you'll have to choose.
I believe there is some scope of narrowing down your question to a point where we can start discussing individual bits and pieces of the puzzle.
A simple option is to store reference to root node in every node - this way it is easy to detect orphan nodes.
Another interesting option is to store hierarchy information as a descendants (transitive closure) table. So for 1.2.3 node you'd have the following relations:
1., 1.2.3. - root node is ascendant of 1.2.3.
1.2., 1.2.3. - 1.2. node is ascendant of 1.2.3.
1., 1.2. - root node is ascendant of 1.2.
etc...
This table can be more resistant to errors as it holds some redundant info.
Goran
The typical method to store a hierarchy, is to have a ParentNode property/field in every node. For root the ParentNode is null, for all other nodes it has a value. This does mean that the tree may lose entire branches, but in memory that seems unlikely, and in a DB you can guard against that using constraints.
This approach doesn't directly support the finding of all siblings, if that is a requirement I'd add another property/field for depth, root has depth 0, all nodes below root have depth 1, and so on. All nodes with the same depth are siblings.

Resources