I need to implement a simple (but not binary) tree in Julia. Each node needs an integer ID, and I need a convenient way to get the list of children of a node and to add a child to an existing node by ID.
e.g. 0->1->(2->(3,4,5),6)
where each number represents a node, I need the functions children(2) and add(7 as child of 4).
I am aware that similar tree implementations can be found for other languages, but I am pretty new to OOP/classes/data structures and not managing to "translate" them to Julia.
You didn't state whether you want the IDs to be assigned automatically as new nodes are added, or if you want to specify them when you add the children (which would involve some form of more complicated lookup).
If the IDs can be assigned, you could implement a tree structure as follows:
struct TreeNode
    parent::Int
    children::Vector{Int}
end

struct Tree
    nodes::Vector{TreeNode}
end

# The root is node 1; its parent is 0, meaning "no parent".
Tree() = Tree([TreeNode(0, Int[])])

function addchild(tree::Tree, id::Int)
    1 <= id <= length(tree.nodes) || throw(BoundsError(tree, id))
    push!(tree.nodes, TreeNode(id, Int[]))
    child = length(tree.nodes)   # the new node's ID is its index in tree.nodes
    push!(tree.nodes[id].children, child)
    child
end

children(tree, id) = tree.nodes[id].children
parent(tree, id) = tree.nodes[id].parent
Otherwise, you might want to use Dict{Int,TreeNode} to store the tree nodes.
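As a rough illustration of the dictionary-based variant, here is a minimal sketch in Python rather than Julia. The class name Tree and the method add_child are hypothetical, and the sketch assumes the caller chooses every ID, including the root's.

```python
class Tree:
    """Tree keyed by caller-chosen integer IDs (hypothetical API)."""

    def __init__(self, root_id=0):
        # Two dictionaries keyed by node ID: one for the parent, one for the children.
        self.parent = {root_id: None}
        self.children = {root_id: []}

    def add_child(self, parent_id, child_id):
        if parent_id not in self.children:
            raise KeyError(f"unknown parent {parent_id}")
        if child_id in self.children:
            raise ValueError(f"duplicate node {child_id}")
        self.parent[child_id] = parent_id
        self.children[child_id] = []
        self.children[parent_id].append(child_id)


# Build 0->1->(2->(3,4,5),6) from the question, then add 7 as a child of 4.
tree = Tree(0)
for parent_id, child_id in [(0, 1), (1, 2), (1, 6), (2, 3), (2, 4), (2, 5), (4, 7)]:
    tree.add_child(parent_id, child_id)
```

Here tree.children[2] gives the children of node 2, and lookups stay constant time on average even when the IDs are sparse.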
We're learning about hash tables in my data structures and algorithms class, and I'm having trouble understanding separate chaining.
I know the basic premise: each bucket has a pointer to a Node that contains a key-value pair, and each Node contains a pointer to the next (potential) Node in the current bucket's mini linked list. This is mainly used to handle collisions.
Now, suppose for simplicity that the hash table has 5 buckets. Suppose I wrote the following lines of code in my main after creating an appropriate hash table instance.
myHashTable["rick"] = "Rick Sanchez";
myHashTable["morty"] = "Morty Smith";
Let's imagine whatever hashing function we're using just so happens to produce the same bucket index for both string keys rick and morty. Let's say that bucket index is index 0, for simplicity.
So at index 0 in our hash table, we have two nodes with values of Rick Sanchez and Morty Smith, in whatever order we decide to put them in (the first pointing to the second).
When I want to display the corresponding value for rick, which is Rick Sanchez per our code here, the hashing function will produce the bucket index of 0.
How do I decide which node needs to be returned? Do I loop through the nodes until I find the one whose key matches rick?
Yes. To resolve a collision, that is, to put or get an item whose hash value lands in the same bucket as another item's, you fall back to the data structure that backs the bucket, which is generally a linked list. When many keys collide, this is the worst case for a hash table: reaching the right item in the list is an O(n) operation. It is exactly the loop you describe, one that searches the list for the item with the matching key. If the bucket is backed by a balanced tree instead, the search can take O(log n) time, as in the Java 8 implementation.
As JEP 180: Handle Frequent HashMap Collisions with Balanced Trees says:
The principal idea is that once the number of items in a hash bucket
grows beyond a certain threshold, that bucket will switch from using a
linked list of entries to a balanced tree. In the case of high hash
collisions, this will improve worst-case performance from O(n) to
O(log n).
This technique has already been implemented in the latest version of
the java.util.concurrent.ConcurrentHashMap class, which is also slated
for inclusion in JDK 8 as part of JEP 155. Portions of that code will
be re-used to implement the same idea in the HashMap and LinkedHashMap
classes.
I strongly suggest always looking at an existing implementation; the Java 7 one is a good place to start. Doing so improves your code-reading skills, which matter almost as much as writing code, since you read code more often than you write it. It takes more effort, but it pays off.
For example, take a look at the Hashtable.get method from Java 7:
public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}
Here the check if ((e.hash == hash) && e.key.equals(key)) is what finds the entry with the matching key as the loop walks the bucket's list.
And here is the full source code: Hashtable.java
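To tie this back to the rick/morty example, here is a minimal separate-chaining sketch in Python. It is an illustration under assumptions, not the Java implementation: the class ChainedHashTable is hypothetical, each chain is a plain Python list instead of linked nodes, and a single bucket is used to force both keys to collide.

```python
class ChainedHashTable:
    def __init__(self, bucket_count=5):
        # Each bucket holds a chain of (key, value) pairs.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite the existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Walk the chain and compare keys; this is the loop from the question.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return None


# One bucket guarantees that "rick" and "morty" collide.
table = ChainedHashTable(bucket_count=1)
table.put("rick", "Rick Sanchez")
table.put("morty", "Morty Smith")
```

Both entries sit in the same chain, and get tells them apart only by comparing the stored keys.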
My End Goal:
Implement a hash table from scratch. The twist: if the number of entries in a hash bucket is greater than 10, the bucket is stored as a Binary Search Tree; otherwise it is stored as a Linked List.
As far as I know, the only way to achieve this is through an enum class:
enum class type_name { a, b };
My Question: Can 'a', and 'b' be classes?
Thought Process:
So to implement the hash table, I am thinking of making an array of the enumerated class. That way, as soon as the Linked List at any index of the array grows past 10 entries, it can be replaced with a Binary Search Tree.
If this is not possible, what would be the best way to achieve this? My implementation for Linked List and Binary Search Tree are complete and work perfectly.
Note: I am not looking for a complete implementation or full code. I would like to code it myself, but I think my theory is flawed.
Visualization of My Idea
---------------------------------- HASH TABLE ----------------------------------
enum class Hash { LinkedList, Tree };
INDEXES:             0           1           2           3           4
Hash eg = new Hash [ LinkedList, LinkedList, LinkedList, LinkedList, LinkedList ]

//11th element is inserted into eg[2]
//Method to Replace Linked List with Binary Search Tree
if (eg[2].getSize() > 10) {
    Tree toReplace;
    Node *follow = eg[2].headptr; //Each linked list is made of connected
                                  //headptr is a pointer to the first element of the linked list
    while ( follow != nullptr ){
        toReplace.insert(follow->value);
        follow = follow->next(); //Next is the pointer to the next element in the linked list
    }
}
//Now, the Linked List at eg[2] is replaced with a Binary Search Tree
Hash eg = new Hash [ LinkedList, LinkedList, Tree, LinkedList, LinkedList ]
Short answer: No.
An enumeration is a distinct type whose value is restricted to a range
of values (see below for details), which may include several
explicitly named constants ("enumerators"). The values of the
constants are values of an integral type known as the underlying type
of the enumeration.
http://en.cppreference.com/w/cpp/language/enum
Classes will not be 'values of an integral type'.
You may be able to achieve what you want with a tuple.
http://en.cppreference.com/w/cpp/utility/tuple
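One way to keep the spirit of the idea is to use the enumeration only as a tag that records which structure a bucket currently uses, and to store the structure itself next to the tag. The sketch below shows that in Python rather than C++. Everything in it is an assumption made for illustration: the names Kind, Bucket, bst_insert and bst_contains are hypothetical, a Python list stands in for the linked list, and the tree is a plain unbalanced binary search tree.

```python
from enum import Enum

THRESHOLD = 10  # from the question: more than 10 entries switches to a tree


class Kind(Enum):
    LINKED_LIST = 1
    TREE = 2


class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key into an unbalanced BST and return the root."""
    if root is None:
        return TreeNode(key)
    node = root
    while True:
        if key == node.key:
            return root  # ignore duplicates
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, TreeNode(key))
            return root
        node = child


def bst_contains(root, key):
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


class Bucket:
    def __init__(self):
        self.kind = Kind.LINKED_LIST  # the enum is only a tag
        self.items = []               # stands in for the linked list
        self.root = None              # BST root, used once kind is TREE

    def insert(self, key):
        if self.kind is Kind.LINKED_LIST:
            self.items.append(key)
            if len(self.items) > THRESHOLD:
                self._convert_to_tree()
        else:
            self.root = bst_insert(self.root, key)

    def _convert_to_tree(self):
        # Move every entry into the tree, then flip the tag.
        for key in self.items:
            self.root = bst_insert(self.root, key)
        self.items = []
        self.kind = Kind.TREE

    def contains(self, key):
        if self.kind is Kind.LINKED_LIST:
            return key in self.items
        return bst_contains(self.root, key)


# The hash table itself is then just an array of buckets.
table = [Bucket() for _ in range(5)]
```

The array element type never changes; only the tag and the storage inside one bucket do, so the switch stays local to that bucket.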
Is there some algorithm to reach the grandchildren of a node in a binary tree, as in the example?
In the picture, there are nodes linking grandparents to their grandchildren, whereas a normal binary tree only links children to parents. What algorithm would one use to link to grandparents?
EDIT:
each node has an index and two values.
[index]
[value value];
What I'm trying to do:
index[3] and index[4] = value[0];
index[5] and index[6] = value[1];
index[7] and index[8] = value[2];
index[9] and index[10] = value[3];
.... ETC
Typically you construct each node in a binary tree with two pointers: a left child pointer 'node.left' and a right child pointer 'node.right'. The four grandchildren of a node can then be located with the expressions 'node.left.left', 'node.left.right', 'node.right.left', and 'node.right.right'. These expressions evaluate very quickly.
Accessing the grandchildren via this technique will make everything much simpler for the person who has to maintain your code, which might even be you ten months from now after you have had time to forget that you ever had this discussion.
If you insist on storing the grandchild pointers redundantly then you will need four additional pointers per node: 'node.leftleft', 'node.leftright', 'node.rightleft', and 'node.rightright'.
This feels like the very definition of a bad idea. Not only will the tree be big and clumsy, but every time you add or delete a node you will find yourself updating a metric barrowload of pointers. In order to recoup the time you will spend debugging such a mess, you will have to use the program for about nine thousand years.
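The pointer-chasing approach can be sketched in a few lines of Python. The Node class and the grandchildren helper are hypothetical names; the only addition to the expressions above is a check for missing children, which a real tree needs because not every node has two children.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def grandchildren(node):
    """Return the existing grandchildren of node, left to right."""
    result = []
    for child in (node.left, node.right):
        if child is None:
            continue
        for grandchild in (child.left, child.right):
            if grandchild is not None:
                result.append(grandchild)
    return result


# A complete tree with values 0..6, numbered level by level.
root = Node(0, Node(1, Node(3), Node(4)), Node(2, Node(5), Node(6)))
values = [g.value for g in grandchildren(root)]
```

No extra pointers are stored, so inserting or deleting a node needs no additional bookkeeping.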
       0
     /   \
    1     2
   / \   / \
  3   4 5   6
[0,1,2,3,4,5,6] Binary Tree as Array
left = 2n+1
right = 2n+2
left-left-grandChild = 2(2n+1)+1 => 4n+3
left-right-grandChild = 2(2n+1)+2 => 4n+4
right-left-grandChild = 2(2n+2)+1 => 4n+5
right-right-grandChild = 2(2n+2)+2 => 4n+6
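The index formulas above translate directly into code. This is a small Python sketch with hypothetical function names; it assumes the tree is stored level by level in an array of length size, with the root at index 0. The grandparent formula is derived by applying the parent formula (n - 1) // 2 twice.

```python
def grandchild_indices(n, size):
    """Indices 4n+3 .. 4n+6 that actually exist in an array of length size."""
    return [i for i in range(4 * n + 3, 4 * n + 7) if i < size]


def grandparent_index(n):
    """Index of the grandparent of node n, or None for the top two levels."""
    if n < 3:
        return None
    return (n - 3) // 4


tree = [0, 1, 2, 3, 4, 5, 6]
root_grandchildren = [tree[i] for i in grandchild_indices(0, len(tree))]
```

Because the positions are pure arithmetic, no links of any kind need to be stored for either direction.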
I'm having a hell of a time trying to figure this one out. Everywhere I look, I seem to be only running into explanations on how to actually traverse through the list non-recursively (the part I actually understand). Can anyone out there hammer in how exactly I can go through the list initially and find the actual predecessor/successor nodes so I can flag them in the node class? I need to be able to create a simple Binary Search Tree and go through the list and reroute the null links to the predecessor/successor. I've had some luck with a solution somewhat like the following:
thread(node n, node p) {
    if (n.left != null)
        thread(n.left, n);
    if (n.right != null) {
        thread(n.right, p);
    }
    n.right = p;
}
From your description, I'll assume you have a node with a structure looking something like:
Node {
left
right
}
... and that you have a binary tree of these set up using left and right, and that you want to reassign left and right so that they form a doubly linked list following a depth-first traversal of the tree.
The root (no pun intended) problem with what you've got so far is that the "node p" (short for previous?) that is passed during the traversal needs to be independent of where in the tree you currently are - it always needs to contain the previously visited node. To do that, each time thread is run it needs to reference the same "previous" variable. I've done some Python-ish pseudo code with one C-ism - if you're not familiar, '&' means "reference to" (or "ref" in C#), and '*' means "dereference and give me the object it is pointing to".
Node lastVisited
thread(root, &lastVisited)

function thread(node, lastVisitedRef)
    if (node.left)
        thread(node.left, lastVisitedRef)
    if (node.right)
        thread(node.right, lastVisitedRef)

    // visit this node, reassigning left and right
    if (*lastVisitedRef)
        node.right = *lastVisitedRef
        (*lastVisitedRef).left = node

    // update the shared lastVisited through the reference
    *lastVisitedRef = node
If you were going to implement this in C, you'd actually need a double pointer to hold the reference, but the idea is the same - you need to persist the location of the "last visited node" during the entire traversal.
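For comparison, here is a runnable Python sketch of the same idea. Python has no explicit references, so the shared "last visited" state lives in a variable of the enclosing function that the recursive helper updates through nonlocal. The names Node, thread and visit are illustrative. The sketch also makes one choice the pseudocode leaves open: it clears each node's left link when the node is visited, so that the final list has clean ends.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def thread(root):
    """Relink the tree into a doubly linked list and return the last visited node."""
    last_visited = None  # shared by every recursive call

    def visit(node):
        nonlocal last_visited
        # Recurse first, while this node's child links are still intact.
        if node.left:
            visit(node.left)
        if node.right:
            visit(node.right)
        # Visit this node: right points back to the previously visited node.
        node.right = last_visited
        node.left = None
        if last_visited is not None:
            last_visited.left = node
        last_visited = node

    if root is not None:
        visit(root)
    return last_visited


# Children are visited before their parent, so the visit order is 1, 3, 2.
one, three = Node(1), Node(3)
two = Node(2, one, three)
tail = thread(two)
```

Following right from the returned node walks the visit order backwards, and following left from the first visited node walks it forwards.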
I am developing a Trie data-structure where each node represents a word. So words st, stack, stackoverflow and overflow will be arranged as
root
--st
---stack
-----stackoverflow
--overflow
My Trie uses a HashTable internally so all node lookup will take constant time. Following is the algorithm I came up to insert an item into the trie.
1. Check whether the item already exists in the trie. If it exists, return; otherwise go to step 2.
2. Iterate over each character in the key and check for the existence of the word. Do this until we reach a node under which the new value can be added as a child. If no such node is found, the value is added under the root node.
3. After insertion, rearrange the siblings of the node under which the new node was inserted. This walks through all the siblings and compares each against the newly inserted node. If any of them starts with the same characters as the new node, it is moved and added as a child of the new node.
I am not sure that this is the correct way of implementing a trie. Any suggestions or improvements are welcome.
Language used : C++
The trie should look like this
          ROOT
overflow/      \st
        O       O
                 \ack
                  O
                   \overflow
                    O
Normally you don't need hash tables as part of a trie; the trie itself is already an efficient index data structure. You can use them if you want to, of course.
But anyway, your step (2) should actually descend the trie during the search and not just query the hash function. In this way you find the insertion point readily and don't need to search for it later as a separate step.
I believe step (3) is wrong, you don't need to rearrange a trie and as a matter of fact you shouldn't be able to because it's only the additional string fragments that you store in the trie; see the picture above.
Following is the Java code for the insert algorithm.
public void insert(String s){
    Node current = root;
    if(s.length()==0) //For an empty string
        current.marker=true;
    for(int i=0;i<s.length();i++){
        Node child = current.subNode(s.charAt(i));
        if(child!=null){
            current = child;
        }
        else{
            current.child.add(new Node(s.charAt(i)));
            current = current.subNode(s.charAt(i));
        }
        // Set marker to indicate end of the word
        if(i==s.length()-1)
            current.marker = true;
    }
}
For a more detailed tutorial, refer here.
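The lookup side follows the same descent as the insert. Below is a minimal Python sketch of both operations on a character-per-node trie; it is not the compressed layout from the picture. The names TrieNode, Trie, insert and contains are illustrative, and storing each node's children in a dict is an assumption made for brevity.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the child node
        self.marker = False  # True if a word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            # Descend, creating the child only when it is missing.
            node = node.children.setdefault(ch, TrieNode())
        node.marker = True

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.marker


trie = Trie()
for word in ["st", "stack", "stackoverflow", "overflow"]:
    trie.insert(word)
```

Inserting the words in a different order produces the same structure, which is why no rearranging step is needed after an insert.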