best data structure to represent a non-binary tree - data-structures

I have a hierarchical data structure. There is not much addition or deletion done to this structure; it's mostly for reading and searching. I'm trying my best to find a good data structure to store this data to enable fast searching. All the examples/tutorials I have seen talk about some form of binary tree. Is there a data structure (tree) that will enable me to model this effectively? An alternative I can think of is to use a graph, but I'm not sure about that.

A B-tree is a good fit for your description because it performs very well for "reading and searching": insertion, deletion, and search are all O(log n). It is also cache-friendly, since each node stores many keys together, so a lookup touches few nodes and incurs few cache misses.
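To make the search part concrete, here is a minimal sketch of a B-tree lookup in Python. It only covers searching a tree that already exists; insertion with node splitting is left out. The names `BTreeNode` and `btree_search` and the sample keys are made up for illustration and do not come from any library.

```python
from bisect import bisect_left


class BTreeNode:
    """One B-tree node: sorted keys and, for internal nodes, len(keys) + 1 children."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # an empty list means this node is a leaf


def btree_search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # binary search inside the node
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:  # reached a leaf without finding the key
            return False
        node = node.children[i]  # descend into the child between keys i-1 and i
    return False


# Hand-built example tree (hypothetical data): the root keys split the key space in three.
root = BTreeNode(
    [10, 20],
    [BTreeNode([1, 5]), BTreeNode([12, 17]), BTreeNode([25, 30, 40])],
)
```

Each step of the loop inspects one node, so the number of nodes visited equals the height of the tree, and all keys of a node sit next to each other in memory.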

Related

What is a data structure? Simple, straightforward explanation required

I have to explain what a data structure is to someone, so what would be the easiest way to explain it? Would it be right if I say
"A data structure is used to organize data (arrange it in some fashion) so that we can perform certain operations quickly with as little resource usage as possible"?
I would also explain how values are placed together in locations, and that the addresses and indices of those locations are themselves stored as values.
These are treated as very abstract "structures": linked lists, arrays, pointers, graphs, binary trees. One can do things with them (the algorithms), and they have properties such as being sorted, needing sortedness, fast access, and so on.
This is fundamental and not too complicated, and with a good grasp of data structures and their correct usage, problems can be solved elegantly. For learning data structures, a language like Pascal is more beneficial than C.
In computer science, a data structure is a particular way of organizing data in a computer so that it can be used efficiently.
Source: wikipedia (https://en.wikipedia.org/wiki/Data_structure)
I would say what you wrote is pretty close. :)

Hashing Performances

I am studying for a test, and the study sheet asks about the performance of hashing. It asks about things such as:
Add
Remove
Search / Contains
Space vs Time
Applications where it makes sense
Performance of hashing means analyzing how the various data structures that use hashing perform.
To find out the performance of hashing, first find out which data structures use hashing. Examples are hash tables, hash maps, hash trees, etc.
After that, run a round of performance tests for the operations below on each data structure.
Add an element
Remove
Search
Find out the time and space complexity of each operation for each data structure.
After that you will have quite a good idea of where you can use which data structure, and given a requirement you will be able to find out which data structure performs efficiently.
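As a starting point for such an analysis, here is a minimal sketch of a hash set with separate chaining in Python. It is only one possible hashing scheme, and the class name `ChainedHashSet`, the initial bucket count, and the resize threshold are assumptions chosen for illustration.

```python
class ChainedHashSet:
    """Hash set using separate chaining: each bucket is a small list."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]
        self._size = 0

    def _bucket(self, item):
        # The hash value selects the bucket; only that bucket is scanned.
        return self._buckets[hash(item) % len(self._buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:
            bucket.append(item)
            self._size += 1
            if self._size > 2 * len(self._buckets):  # keep chains short
                self._resize()

    def remove(self, item):
        """Remove item if present; return True if something was removed."""
        bucket = self._bucket(item)
        if item in bucket:
            bucket.remove(item)
            self._size -= 1
            return True
        return False

    def contains(self, item):
        return item in self._bucket(item)

    def __len__(self):
        return self._size

    def _resize(self):
        # Double the bucket array and redistribute every stored item.
        items = [x for bucket in self._buckets for x in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for x in items:
            self._bucket(x).append(x)
```

With a reasonable hash function, add, remove, and contains take O(1) time on average, because each chain stays short. In the worst case, when every item lands in the same bucket, they degrade to O(n). The price is space: the bucket array is kept larger than strictly needed, which is the usual space versus time trade-off.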
Hope this helps and clears your doubts :)

A data structure with certain properties

I want to implement a data structure myself in C++11. I'm planning a data structure with the following properties:
search. O(log(n))
insert. O(log(n))
delete. O(log(n))
iterate. O(n)
What I have been thinking about after research was implementing a balanced binary search tree. Are there other structures that would fulfill my needs? I am completely new to this topic and thought a question here would give me a good jumpstart.
First of all, using the existing standard library data types is definitely the way to go for production code. But since you are asking how to implement such data structures yourself, I assume this is mainly an educational exercise for you.
Binary search trees of some form (https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree#Implementations), B-trees (https://en.wikipedia.org/wiki/B-tree), and hash tables (https://en.wikipedia.org/wiki/Hash_table) are the data structures usually used to accomplish efficient insertion and lookup. If you want to go wild you can combine the two by using a tree instead of a linked list to handle hash collisions (although this is likely to make your implementation slower unless you have made massive mistakes in sizing your hash table or in choosing an adequate hash function).
Since I'm assuming you want to learn something, you might want to have a look at minimal perfect hashing in the context of hash tables (https://en.wikipedia.org/wiki/Perfect_hash_function), although this only has uses in special applications (I have had the opportunity to use a minimal perfect hash function exactly once). But it sure is fascinating. As you can see from the self-balancing tree link above, the botany of search trees is virtually limitless in scope, so you can also go wild on that front.
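To give a feel for how small a balanced search tree can be, here is a minimal sketch of a treap, a randomized balanced binary search tree. It is written in Python rather than C++11 purely for brevity, and a treap is swapped in here because it is shorter than an AVL or red-black tree. Its O(log n) bounds for search, insert, and delete hold in expectation, not in the worst case. All names (`Treap`, `_split`, `_merge`, `_delete`) are made up for this sketch.

```python
import random


class _Node:
    __slots__ = ("key", "prio", "left", "right")

    def __init__(self, key):
        self.key = key
        self.prio = random.random()  # random heap priority keeps the tree balanced
        self.left = None
        self.right = None


def _split(node, key):
    """Split a treap into (keys < key, keys >= key)."""
    if node is None:
        return None, None
    if node.key < key:
        left, right = _split(node.right, key)
        node.right = left
        return node, right
    left, right = _split(node.left, key)
    node.left = right
    return left, node


def _merge(a, b):
    """Merge two treaps where every key in a is smaller than every key in b."""
    if a is None:
        return b
    if b is None:
        return a
    if a.prio > b.prio:
        a.right = _merge(a.right, b)
        return a
    b.left = _merge(a, b.left)
    return b


def _delete(node, key):
    """Return the subtree rooted at node with key removed (if present)."""
    if node is None:
        return None
    if key == node.key:
        return _merge(node.left, node.right)
    if key < node.key:
        node.left = _delete(node.left, key)
    else:
        node.right = _delete(node.right, key)
    return node


class Treap:
    def __init__(self):
        self._root = None

    def contains(self, key):
        node = self._root
        while node is not None:
            if key == node.key:
                return True
            node = node.left if key < node.key else node.right
        return False

    def insert(self, key):
        if self.contains(key):
            return  # keys are unique
        left, right = _split(self._root, key)
        self._root = _merge(_merge(left, _Node(key)), right)

    def delete(self, key):
        self._root = _delete(self._root, key)

    def __iter__(self):
        # In-order traversal with an explicit stack: yields keys in sorted order, O(n) total.
        stack, node = [], self._root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key
            node = node.right
```

Iterating over a `Treap` yields the keys in sorted order, which is something a hash table cannot offer without extra work.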

Which type of Tree Data Structure is suitable for efficient frequent pattern mining?

I am currently working on frequent pattern mining (FPM). I was googling for data structures that can be used for FPM. My main concern is the space-compactness of the data structure, as I am planning to run a distributed algorithm over it (handling synchronization over a DS that fits in my main memory). The data structures I have come across are:
Prefix-Tree
Compact Prefix-Tree or Radix Tree
Prefix Hash Tree (PHT)
Burst Trie (currently reading how it works)
I don't know the order in which these data structures evolved. Can anyone tell me which DS (not limited to those mentioned above) best fits my requirements?
P.S.: I currently consider the burst trie to be the best known space-efficient data structure for FPM.
I agree that the question is broad. However, if you're looking for a space-efficient prefix tree, then I would strongly recommend a burst trie. I wrote an implementation and was able to squeeze a lot of space efficiency out of it for Stripe's latest Capture the Flag. (They had a problem which used 4 nodes at less than 500 MB each that "required" a suffix tree.)
If you're looking for an implementation of an efficient burst trie then check mine out.
https://github.com/nbauernfeind/scala-burst-trie
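For comparison, here is a minimal sketch of the plain prefix tree from the top of the question's list, written in Python. It is not a burst trie and it is unrelated to the linked Scala implementation. The per-node `count` field and the names `PrefixTree`, `prefix_count`, and `node_count` are assumptions added for illustration.

```python
class PrefixTreeNode:
    def __init__(self):
        self.children = {}  # item -> PrefixTreeNode
        self.count = 0      # number of inserted sequences that pass through this node


class PrefixTree:
    def __init__(self):
        self.root = PrefixTreeNode()

    def insert(self, sequence):
        """Insert a sequence of items, sharing nodes with existing prefixes."""
        node = self.root
        for item in sequence:
            child = node.children.get(item)
            if child is None:
                child = PrefixTreeNode()
                node.children[item] = child
            child.count += 1
            node = child

    def prefix_count(self, prefix):
        """Return how many inserted sequences start with the given non-empty prefix."""
        node = self.root
        for item in prefix:
            node = node.children.get(item)
            if node is None:
                return 0
        return node.count

    def node_count(self):
        """Number of nodes in the tree, not counting the root."""
        total, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            total += len(node.children)
            stack.extend(node.children.values())
        return total
```

Sequences that share a prefix share the nodes for that prefix, which is where the space saving of prefix-based structures comes from. As I understand them, radix trees and burst tries push this further by collapsing chains of single-child nodes or by storing sparse subtrees in compact containers.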

Best tree structure for Multi-dimensional data

What is the most useful and efficient tree data structure for organizing multi-dimensional data (e.g., K-D-B tree, region quadtree, R-tree)?
I want to know which tree structure has the best search time and the best space utilization.
It highly depends on how your data is distributed in the space and how you want to search for it (what are the criteria you query for?).
Given a location in space, it is very easy to find the right quad-tree bin; on the other hand, a quad-tree introduces more overhead than a well-shaped kd-tree. There is a reason why all of these techniques are still in use.
Specify the problem you want to solve with the data structure.
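To illustrate one of the candidates, here is a minimal sketch of a kd-tree with a nearest-neighbor query in Python. It assumes a static set of points given as equal-length tuples and builds the tree by median splits. The names `KDNode`, `build_kdtree`, `sq_dist`, and `nearest` are made up for this sketch, and it is not meant to suggest that a kd-tree is the right answer for every data distribution.

```python
from collections import namedtuple

KDNode = namedtuple("KDNode", ["point", "axis", "left", "right"])


def build_kdtree(points, depth=0):
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(
        points[mid],
        axis,
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target (None for an empty tree)."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


# Hypothetical sample data.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest(tree, (9, 2))
```

The pruning test in `nearest` is what makes the query fast on well-distributed data. On clustered or high-dimensional data it prunes less, which is one reason the answer depends on how the data is distributed.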
Descriptions of different data structures, including trees, along with source code for their implementations, can be found at https://ece.uwaterloo.ca/~ece250/Algorithms/
Furthermore, runtime information and asymptotic analysis for different types of tree structures can be found under section 4 at https://ece.uwaterloo.ca/~ece250/Lectures/Slides/
These resources are very useful and reliable, and with them you can choose the best structure for your specific needs and data.
I hope this helps!

Resources