To organize multi-dimensional data, what is the most useful and efficient tree data structure (e.g., K-D-B tree, region quadtree, R-tree)?
I want to know which tree structure offers the best search time and the best space utilization.
It highly depends on how your data is distributed in the space and how you want to search for it (what are the criteria you query for?).
Finding the right quadtree bin for a given location in space is very easy. On the other hand, a quadtree introduces more overhead than a well-shaped k-d tree. There is a reason why all of these techniques are still in use.
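To illustrate why locating a quadtree bin is cheap, here is a minimal sketch. It assumes a region quadtree over the unit square that always subdivides to a fixed depth; the function name `quadtree_cell` and the quadrant numbering are my own choices for illustration, not taken from any library.

```python
def quadtree_cell(x, y, depth, bounds=(0.0, 0.0, 1.0, 1.0)):
    """Return the list of quadrant indices (0..3) leading to the cell containing (x, y).

    bounds is (xmin, ymin, xmax, ymax); the point is assumed to lie inside it.
    Quadrant bit 0 is set for the upper half in x, bit 1 for the upper half in y.
    """
    xmin, ymin, xmax, ymax = bounds
    path = []
    for _ in range(depth):
        xmid = (xmin + xmax) / 2
        ymid = (ymin + ymax) / 2
        quadrant = 0
        if x >= xmid:
            quadrant |= 1
            xmin = xmid
        else:
            xmax = xmid
        if y >= ymid:
            quadrant |= 2
            ymin = ymid
        else:
            ymax = ymid
        path.append(quadrant)
    return path


print(quadtree_cell(0.3, 0.9, 2))  # [2, 3]
```

Each level costs two comparisons regardless of how the data is distributed. A k-d tree, by contrast, has to compare against split values stored in its nodes, but it adapts those splits to the data.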
Specify the problem you want to solve with the data structure.
Different data structures, including trees, along with information about them and source code for their implementations, can be found at https://ece.uwaterloo.ca/~ece250/Algorithms/
Furthermore, run-time information and asymptotic analysis for different types of tree structures can be found under section 4 at https://ece.uwaterloo.ca/~ece250/Lectures/Slides/
These are useful and reliable resources, so you can use them to choose the best structure for your specific needs and data.
I hope this helps!
Related
I have a hierarchical data structure. There is not much addition or deletion done to this structure; it's mostly for reading and searching. I'm trying my best to find a good data structure to store this data to enable fast searching. All the examples/tutorials I have seen talk about some form of binary tree. Is there a data structure (tree) that will enable me to model this effectively? An alternative I can think of is to use a graph, but I'm not sure about that.
A B-tree would be a good choice for your description because of its strong performance for reading and searching. It gives you O(log n) insertion, deletion, and search. It is also cache-friendly: each node stores many keys contiguously, so a lookup visits fewer nodes and incurs fewer cache misses than it would in a binary tree.
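As a rough sketch of what a B-tree lookup looks like (search only, no insertion or rebalancing), assuming a hand-built tree and the hypothetical names `BTreeNode` and `btree_search`:

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1 subtrees


def btree_search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # binary search inside a single node
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:            # reached a leaf without finding the key
            return False
        node = node.children[i]          # descend into the i-th subtree
    return False


# A small tree built by hand for illustration.
root = BTreeNode(
    [20, 40],
    [BTreeNode([5, 10]), BTreeNode([25, 30, 35]), BTreeNode([50, 60])],
)

print(btree_search(root, 30))  # True
print(btree_search(root, 45))  # False
```

Because each node holds several keys, the tree is shallow, and most of the work is a binary search over one contiguous array per level.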
I am currently studying advanced data structures and I came across a weird data structure called a treap. I understand what a treap is, but I can't seem to find its utility in a valid use case.
Why should you use such a data structure, and for what types of problems or conditions are treaps best suited?
I find myself much more inclined to use hash maps, min/max heaps, binary search trees, or balanced binary search trees, and I can't tell why you would use a treap.
They are easier to implement than most other balanced binary search trees and, more importantly, that makes them easier to modify and maintain if you later want to make slight variations on them. They also allow for efficient parallel versions of the set operations union, intersection, and difference, which is extremely valuable. Using them simultaneously as a heap and a binary search tree isn't really very handy unless the values you use for priorities happen to be nicely randomly distributed. I suppose there might be a case where that would be handy, but it seems really unlikely. Data that is so randomly distributed is usually more like a hash key, and hash keys typically aren't useful as ordered data. How often do you want to pull people out in order of their SSNs? I guess it's possible but unlikely.
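To give a feel for how little code a treap needs, here is a minimal sketch built on split and merge, with random priorities. All names are my own for illustration, and `union` modifies its inputs. It runs sequentially, but its two recursive calls work on disjoint subtrees, which is what a parallel version would exploit.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # random heap priority; balances the tree in expectation
        self.left = None
        self.right = None


def split(node, key):
    """Split a treap into (keys < key, keys >= key)."""
    if node is None:
        return None, None
    if node.key < key:
        left, right = split(node.right, key)
        node.right = left
        return node, right
    left, right = split(node.left, key)
    node.left = right
    return left, node


def merge(a, b):
    """Merge two treaps where every key in a is smaller than every key in b."""
    if a is None:
        return b
    if b is None:
        return a
    if a.priority > b.priority:
        a.right = merge(a.right, b)
        return a
    b.left = merge(a, b.left)
    return b


def contains(node, key):
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


def insert(root, key):
    if contains(root, key):
        return root
    left, right = split(root, key)
    return merge(merge(left, Node(key)), right)


def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        return merge(node.left, node.right)
    return node


def union(a, b):
    """Set union of two treaps (destructive)."""
    if a is None:
        return b
    if b is None:
        return a
    if a.priority < b.priority:
        a, b = b, a                      # keep the higher-priority root on top
    left, right = split(b, a.key)
    right = delete(right, a.key)         # drop a duplicate of the root key, if any
    a.left = union(a.left, left)         # these two calls touch disjoint subtrees
    a.right = union(a.right, right)
    return a


def in_order(node):
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


print(list(in_order(union(build([1, 3, 5]), build([2, 3, 6])))))  # [1, 2, 3, 5, 6]
```

Compare this with the rotation cases a red-black tree needs: here everything is expressed through `split` and `merge`, so variations (order statistics, range updates) usually only need an extra field maintained in those two functions.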
I have to explain what a data structure is to someone, so what would be the easiest way to explain it? Would it be right if I say:
"A data structure is used to organize data (arrange data in some fashion) so that we can perform certain operations quickly with as little resource usage as possible"
It describes how values are placed together in memory locations, and how the addresses and indices of those locations are themselves stored as values.
On top of that come the very abstract "structures": linked lists, arrays, pointers, graphs, binary trees. Then there are the things one can do with them (the algorithms), and their properties, such as being sorted, requiring sorted input, offering fast access, and so on.
This is fundamental and not too complicated. With a good grasp of data structures and their correct usage, problems can be solved elegantly. For learning data structures, a language like Pascal is more beneficial than C.
In computer science, a data structure is a particular way of organizing data in a computer so that it can be used efficiently.
Source: Wikipedia (https://en.wikipedia.org/wiki/Data_structure)
I would say what you wrote is pretty close. :)
I want to implement a data structure myself in C++11. I'm planning to build a data structure with the following properties:
search: O(log n)
insert: O(log n)
delete: O(log n)
iterate: O(n)
What I have been thinking about after research was implementing a balanced binary search tree. Are there other structures that would fulfill my needs? I am completely new to this topic and thought a question here would give me a good jumpstart.
First of all, using the existing standard library data types is definitely the way to go for production code. But since you are asking how to implement such data structures yourself, I assume this is mainly an educational exercise for you.
Binary search trees of some form (https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree#Implementations), B-trees (https://en.wikipedia.org/wiki/B-tree), and hash tables (https://en.wikipedia.org/wiki/Hash_table) are the data structures usually used to accomplish efficient insertion and lookup. If you want to go wild, you can combine the two by using a tree instead of a linked list to handle hash collisions (although this is quite likely to make your implementation slower, unless you have made massive mistakes in sizing your hash table or in choosing an adequate hash function).
Since I'm assuming you want to learn something, you might want to have a look at minimal perfect hashing in the context of hash tables (https://en.wikipedia.org/wiki/Perfect_hash_function), although this only has uses in special applications (I have had the opportunity to use a minimal perfect hash function exactly once). But it sure is fascinating. As you can see from the first link above, the botany of search trees is virtually limitless in scope, so you can also go wild on that front.
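For a taste of what minimal perfect hashing means, here is a toy sketch that brute-forces a seed for a small, fixed key set. It is not how practical constructions work (those scale to millions of keys); the use of CRC32 and all names here are arbitrary assumptions for illustration.

```python
import zlib


def slot(key, seed, n):
    """Deterministic seeded hash of a string key into 0..n-1 (CRC32 is an arbitrary choice)."""
    return zlib.crc32(f"{seed}:{key}".encode("utf-8")) % n


def find_minimal_perfect_seed(keys, max_seed=1_000_000):
    """Try seeds until the n distinct keys land in n distinct slots."""
    n = len(keys)
    for seed in range(max_seed):
        if len({slot(k, seed, n) for k in keys}) == n:
            return seed
    raise ValueError("no seed found; try a larger max_seed")


KEYS = ["if", "else", "while", "for", "return"]
SEED = find_minimal_perfect_seed(KEYS)

# The table has exactly len(KEYS) slots and every slot is used.
TABLE = [None] * len(KEYS)
for k in KEYS:
    TABLE[slot(k, SEED, len(KEYS))] = k


def lookup(key):
    """Membership test with exactly one probe and no collision handling."""
    return TABLE[slot(key, SEED, len(KEYS))] == key
```

The key set must be known in advance and must not change, which is why this only fits special applications such as keyword tables.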
I am currently working on frequent pattern mining (FPM). I was googling for data structures that can be used for FPM. My main concern is the space-compactness of the data structure, as I am planning to run a distributed algorithm over it (handling synchronization over a data structure that fits in main memory). The data structures I have come across are:
Prefix-Tree
Compact Prefix-Tree or Radix Tree
Prefix Hash Tree (PHT)
Burst Trie (currently reading how it works)
I don't know the order in which these data structures evolved. Can anyone tell me which data structure (not limited to the ones mentioned above) best fits my requirements?
P.S.: I currently consider the burst trie to be the best-known space-efficient data structure for FPM.
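For reference, the space saving of a plain prefix tree over storing raw transactions comes from shared prefixes. Below is a minimal sketch with per-node counts; it is neither a burst trie nor a full FP-tree. Items are ordered alphabetically here for simplicity (FP-growth typically orders by descending frequency instead), and all names are made up for illustration.

```python
class TrieNode:
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0      # number of transactions passing through this node
        self.children = {}  # item -> TrieNode


def insert_transaction(root, items):
    """Insert one transaction; items are deduplicated and put in a canonical order."""
    node = root
    for item in sorted(set(items)):
        node = node.children.setdefault(item, TrieNode())
        node.count += 1


def prefix_support(root, prefix):
    """Number of transactions whose ordered item list starts with prefix."""
    node = root
    for item in prefix:
        node = node.children.get(item)
        if node is None:
            return 0
    return node.count


def node_count(root):
    """Number of nodes below root, as a rough measure of space."""
    return sum(1 + node_count(child) for child in root.children.values())


transactions = [["a", "b", "c"], ["b", "a"], ["a", "c"], ["b", "c"]]
root = TrieNode()
for t in transactions:
    insert_transaction(root, t)

print(node_count(root))                   # 6 nodes for 9 stored items
print(prefix_support(root, ["a", "b"]))   # 2
```

A radix tree compresses this further by collapsing chains of single-child nodes into one node.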
I agree that the question is broad. However, if you're looking for a space-efficient prefix tree, then I would strongly recommend a burst trie. I wrote an implementation and was able to squeeze a lot of space efficiency out of it for Stripe's latest Capture the Flag. (They had a problem which used 4 nodes at less than 500 MB each that "required" a suffix tree.)
If you're looking for an implementation of an efficient burst trie then check mine out.
https://github.com/nbauernfeind/scala-burst-trie