How to insert duplicate keys into B-trees - algorithm

Please answer for B-trees, not B+ trees.
I have two questions.
What happens when you insert duplicate keys into a B-tree?
For the following input, what will a B-tree with t=3 look like?
1,1,1,1,1,1,1,1,1,1,1,1,1,1
Can a parent node in a B-tree with t=3 look like this?
1,1,4,10
If so, will the child between the first key "1" and the second key "1" contain only the value "1"?

Just as in a chained hash table, each key in the tree should store a link to a list of the items associated with that key. The tree stores only unique keys, and each link points to a list that may hold multiple items:
[node, key=1, ptr=l], l={1,1,1,1,1,1,1...}
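A minimal Python sketch of this idea, assuming a textbook-style B-tree with minimum degree t (at most 2t-1 keys per node). The names `BTree`, `Node`, `items` and `find` are illustrative and not taken from the question. Inserting a key that already exists only appends to that key's list, so the shape of the tree does not change:

```python
from bisect import bisect_left

class Node:
    def __init__(self, leaf=True):
        self.keys = []      # unique keys, kept sorted
        self.items = []     # items[i] is the list of values stored under keys[i]
        self.children = []  # stays empty for leaf nodes
        self.leaf = leaf

class BTree:
    def __init__(self, t=3):
        self.t = t          # minimum degree: a node holds at most 2t-1 keys
        self.root = Node()

    def find(self, key):
        """Return the list stored under key, or None if the key is absent."""
        node = self.root
        while True:
            i = bisect_left(node.keys, key)
            if i < len(node.keys) and node.keys[i] == key:
                return node.items[i]
            if node.leaf:
                return None
            node = node.children[i]

    def insert(self, key, value):
        bucket = self.find(key)
        if bucket is not None:
            bucket.append(value)  # duplicate key: extend its list, tree unchanged
            return
        if len(self.root.keys) == 2 * self.t - 1:
            old_root = self.root
            self.root = Node(leaf=False)
            self.root.children.append(old_root)
            self._split_child(self.root, 0)
        self._insert_nonfull(self.root, key, value)

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = Node(leaf=full.leaf)
        # Move the median key (and its list) up into the parent.
        parent.keys.insert(i, full.keys[t - 1])
        parent.items.insert(i, full.items[t - 1])
        parent.children.insert(i + 1, right)
        right.keys, right.items = full.keys[t:], full.items[t:]
        full.keys, full.items = full.keys[:t - 1], full.items[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]

    def _insert_nonfull(self, node, key, value):
        # The key is known to be absent here, so it always ends up in a leaf.
        while not node.leaf:
            i = bisect_left(node.keys, key)
            if len(node.children[i].keys) == 2 * self.t - 1:
                self._split_child(node, i)
                if key > node.keys[i]:
                    i += 1
            node = node.children[i]
        i = bisect_left(node.keys, key)
        node.keys.insert(i, key)
        node.items.insert(i, [value])

# The input from the question: fourteen copies of the key 1.
tree = BTree(t=3)
for _ in range(14):
    tree.insert(1, 1)
```

Under this design the fourteen inserts leave a single root node holding one key, and a node such as 1,1,4,10 never arises because keys inside the tree are unique.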

Related

Can an enumeration be a class? - C++ [enum class] [Theory]

My End Goal:
Create the implementation of a hash table from scratch. The twist: if the number of entries in a hash bucket is greater than 10, the bucket is stored in a Binary Search Tree; otherwise it is stored in a Linked List.
To my knowledge, the only way to achieve this is through an
enum class type_name { a, b };
My Question: Can 'a', and 'b' be classes?
Thought Process:
So to implement a hash table, I am thinking of making an array of the enumerated class. This way, as soon as the Linked List at any index of the array grows past 10 entries, it will be replaced with a Binary Search Tree.
If this is not possible, what would be the best way to achieve this? My implementations of Linked List and Binary Search Tree are complete and work perfectly.
Note: I am not looking for a complete implementation or full code. I would like to be able to code it myself, but I think my theory is flawed.
Visualization of My Idea
----------------------------------H A S H T A B L E---------------------------------------
enum class Hash { LinkedList, Tree };
INDEXES: 0 1 2 3 4
Hash eg = new Hash [ LinkedList, LinkedList, LinkedList, LinkedList, LinkedList ]
//11th element is inserted into eg[2]
//Method to Replace Linked List with Binary Search Tree
if (eg[2].getSize() > 10) {
    Tree toReplace; //Note: "Tree toReplace();" would declare a function, not an object
    Node *follow = eg[2].headptr; //Each linked list is made of connected
    //headptr is a pointer to the first element of the linked list
    while (follow != nullptr) {
        toReplace.insert(follow->value);
        follow = follow->next; //next is the pointer to the next element in the linked list
    }
}
//Now, the Linked List at eg[2] is replaced with a Binary Search Tree
Hash eg = new Hash [ LinkedList, LinkedList, Tree, LinkedList, LinkedList ]
Short answer: No.
An enumeration is a distinct type whose value is restricted to a range
of values (see below for details), which may include several
explicitly named constants ("enumerators"). The values of the
constants are values of an integral type known as the underlying type
of the enumeration.
http://en.cppreference.com/w/cpp/language/enum
Classes will not be 'values of an integral type'.
You may be able to achieve what you want with a tuple.
http://en.cppreference.com/w/cpp/utility/tuple
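One way to picture the design the question is after is a bucket that carries an enum tag next to its storage, so the enum only records which structure is active while a separate object holds the data. The sketch below is in Python for brevity and is only an illustration: `Kind`, `Bucket`, `HashTable`, `bst_insert` and `bst_contains` are hypothetical names, a Python list stands in for the linked list, and in C++ the same shape could be expressed as a tag plus storage (for example with `std::variant`, C++17).

```python
from enum import Enum

class Kind(Enum):
    LINKED_LIST = 1
    TREE = 2

class TreeNode:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def bst_insert(root, value):
    """Insert value into an unbalanced BST and return the (possibly new) root."""
    if root is None:
        return TreeNode(value)
    node = root
    while True:
        side = "left" if value < node.value else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, TreeNode(value))
            return root
        node = child

def bst_contains(root, value):
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False

class Bucket:
    LIMIT = 10  # threshold taken from the question

    def __init__(self):
        self.kind = Kind.LINKED_LIST
        self.items = []   # stand-in for the linked list
        self.root = None  # BST root once the bucket has been converted
        self.size = 0

    def insert(self, value):
        self.size += 1
        if self.kind is Kind.TREE:
            self.root = bst_insert(self.root, value)
            return
        self.items.append(value)
        if self.size > self.LIMIT:
            # The 11th entry triggers the conversion: copy every item into a tree.
            for item in self.items:
                self.root = bst_insert(self.root, item)
            self.items = []
            self.kind = Kind.TREE

    def contains(self, value):
        if self.kind is Kind.TREE:
            return bst_contains(self.root, value)
        return value in self.items

class HashTable:
    def __init__(self, n_buckets=5):
        self.buckets = [Bucket() for _ in range(n_buckets)]

    def _bucket(self, value):
        return self.buckets[hash(value) % len(self.buckets)]

    def insert(self, value):
        self._bucket(value).insert(value)

    def contains(self, value):
        return self._bucket(value).contains(value)

# Eleven multiples of 5 all land in bucket 0 of a 5-bucket table.
table = HashTable(5)
for n in range(0, 55, 5):
    table.insert(n)
```

The array holds `Bucket` objects rather than enum values, which sidesteps the problem that enumerators cannot themselves be classes.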

How to create a binary tree that also links grandparents to grandchildren?

Is there some algorithm to reach the grandchildren of a node in a binary tree, as in the example?
In the picture, there are nodes linking grandparents to their grandchildren, whereas a normal binary tree only links children to parents. What algorithm would one use to link to grandparents?
EDIT:
each node has an index and two values.
[index]
[value value];
What I'm trying to do:
index[3] and index[4] = value[0];
index[5] and index[6] = value[1];
index[7] and index[8] = value[2];
index[9] and index[10] = value[3];
.... ETC
Typically you construct each node in a binary tree with two pointers: a left child pointer 'node.left' and a right child pointer 'node.right'. Then the four grandchildren of a node can be located with the expressions 'node.left.left', 'node.left.right', 'node.right.left', and 'node.right.right'. These expressions evaluate very quickly.
Accessing the grandchildren via this technique will make everything much simpler for the person who has to maintain your code, which might even be you ten months from now after you have had time to forget that you ever had this discussion.
If you insist on storing the grandchild pointers redundantly then you will need four additional pointers per node: 'node.leftleft', 'node.leftright', 'node.rightleft', and 'node.rightright'.
This feels like the very definition of a bad idea. Not only will the tree be big and clumsy, but every time you add or delete a node you will find yourself updating a metric barrowload of pointers. In order to recoup the time you will spend debugging such a mess, you will have to use the program for about nine thousand years.
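A small Python sketch of the two-pointer approach, with a guard for missing children (the `Node` class and the `grandchildren` helper are illustrative names, not part of the question):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def grandchildren(node):
    """Return the existing grandchildren of node, left to right."""
    found = []
    for child in (node.left, node.right):
        if child is None:
            continue
        for grandchild in (child.left, child.right):
            if grandchild is not None:
                found.append(grandchild)
    return found

# A complete tree with values 0..6, laid out level by level.
root = Node(0, Node(1, Node(3), Node(4)), Node(2, Node(5), Node(6)))
values = [g.value for g in grandchildren(root)]
```

No extra pointers are stored, so inserting or deleting a node needs no additional bookkeeping.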
      0
    /   \
   1     2
  / \   / \
 3   4 5   6
[0,1,2,3,4,5,6] Binary Tree as Array
left = 2n+1
right = 2n+2
left-left-grandChild = 2(2n+1)+1 => 4n+3
left-right-grandChild = 2(2n+1)+2 => 4n+4
right-left-grandChild = 2(2n+2)+1 => 4n+5
right-right-grandChild = 2(2n+2)+2 => 4n+6
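The index formulas above can be checked with a short Python sketch, assuming the tree is stored level by level in a zero-based array and n is the index of the node in question (the function names are illustrative):

```python
def grandchild_indices(n, size):
    """Indices 4n+3 .. 4n+6 that fall inside an array of the given size."""
    return [i for i in range(4 * n + 3, 4 * n + 7) if i < size]

def parent_index(i):
    """Inverse of left = 2n+1 and right = 2n+2; the root (index 0) has no parent."""
    return None if i == 0 else (i - 1) // 2

def grandparent_index(i):
    """Apply parent_index twice; None if the node is the root or a child of the root."""
    parent = parent_index(i)
    return None if parent is None else parent_index(parent)

tree = [0, 1, 2, 3, 4, 5, 6]  # the array from the diagram
root_grandchildren = [tree[i] for i in grandchild_indices(0, len(tree))]
```

Going upward works the same way, so a grandchild can reach its grandparent by index arithmetic alone without any stored link.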

What is a HASH TABLE when doing HASH JOIN?

In Oracle's HASH JOIN method, a HASH TABLE is built on one of the tables, and the other table is joined by looking up its values in the hash table.
Could you please explain what a hash table is? What is its structure, and how is it created?
A hash table is a table in which you store and retrieve things by key. It is like an array, but it stores things differently:
a('CanBeVarchar') := 1; -- A hash table
In Oracle PL/SQL, they are called associative arrays or index-by tables, and you declare one like this:
TYPE aHashTable IS TABLE OF [number|varchar2|user-defined-types] INDEX BY VARCHAR2(30);
myTable aHashTable;
So, what is it? it's just a bunch of key-value pairs. The data is stored as a linked list with head nodes that group the data by the use of something called HashCode to find things faster. Something like this:
a   ->  b   ->  c
Any     Bitter  Class
Array   Bold    Count
Say you are storing random words and their meanings (a dictionary); when you store a word that begins with a, it is stored in the 'a' group. So, say you run myTable('Albatroz') := 'It''s a bird'; the hash code is calculated and the entry is put under the 'a' head node, where it belongs: just above 'Any'. The head node a has a link to Any, which has a link to Array, and so on.
Now, the cool thing about it is that you get fast data retrieval. Say you want the meaning of Count; you do this: definition := myTable('Count'); The lookup does not search through Any, Array, Bitter, or Bold. It goes directly to the 'c' head node, passes through Class, and finally reaches Count; that is fast!
Here is a Wikipedia link: http://en.wikipedia.org/wiki/Hash_table
Note that my example is oversimplified; the link covers the topic in more detail.
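To tie this back to the hash join in the question, here is a minimal Python sketch of the general build-and-probe idea. It is an illustration only, not Oracle's actual implementation; the function name, the row format (plain dictionaries) and the sample tables are all assumptions made for the example.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Join two lists of dict rows on the column named by key."""
    # Build phase: hash one input (ideally the smaller one) by the join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    # Probe phase: look up each row of the other input in the hash table.
    joined = []
    for row in probe_rows:
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined

departments = [
    {"dept_id": 10, "dept_name": "Sales"},
    {"dept_id": 20, "dept_name": "IT"},
]
employees = [
    {"emp": "Ann", "dept_id": 10},
    {"emp": "Bob", "dept_id": 20},
    {"emp": "Cy", "dept_id": 30},  # no matching department, so it is dropped
]
result = hash_join(departments, employees, "dept_id")
```

Each probe row costs one key lookup instead of a scan over the other table, which is where the speed of this join method comes from.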
Read up on details such as the load factor: what happens if I get a LOT of elements in the a group and few in the b and c groups? Searching for a word that begins with a is then not very optimal, is it? The hash table uses the load factor to decide when to reorganize and redistribute the load across its groups. For example, the table can be split into subgroups:
From this
a   ->  b   ->  c
Any     Bitter  Class
Anode   Bold    Count
Anti
Array
Arrays
Arrow
To this
an  ->  ar  ->  b   ->  c
Any     Array   Bitter  Class
Anode   Arrays  Bold    Count
Anti    Arrow
Now looking for words like Arrow will be faster.
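A small Python sketch of a chained hash table that regroups itself when the load factor grows too high. It differs from the letter-based picture above in one respect that should be stated plainly: buckets are chosen with Python's built-in `hash()` rather than by first letter, and regrouping doubles the number of buckets instead of splitting one group. The class name, the starting size of 4 and the 0.75 threshold are assumptions chosen for illustration.

```python
class ChainedHashTable:
    def __init__(self, n_buckets=4, max_load=0.75):
        self.buckets = [[] for _ in range(n_buckets)]  # each bucket is a chain of (key, value)
        self.count = 0
        self.max_load = max_load

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self.count += 1
        # Load factor = stored entries / number of buckets.
        if self.count / len(self.buckets) > self.max_load:
            self._resize()

    def get(self, key, default=None):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return default

    def _resize(self):
        pairs = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for key, value in pairs:
            self._bucket(key).append((key, value))

words = ChainedHashTable()
for word in ["Any", "Array", "Bitter", "Bold", "Class", "Count"]:
    words.put(word, "meaning of " + word)
definition = words.get("Count")
```

Only the chain in one bucket is scanned on each lookup, and doubling the bucket array keeps those chains short as the table fills up.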

Basic prefix tree implementation question

I've implemented a basic prefix tree or "trie". The trie consists of nodes like this:
// pseudo-code
struct node {
char c;
collection<node> childnodes;
};
Say I add the following words to my trie: "Apple", "Ark" and "Cat". Now when I look-up prefixes like "Ap" and "Ca" my trie's "bool containsPrefix(string prefix)" method will correctly return true.
Now I'm implementing the method "bool containsWholeWord(string word)" that will return true for "Cat" and "Ark" but false for "App" (in the above example).
Is it common for nodes in a trie to have some sort of "endOfWord" flag? This would help determine if the string being looked-up was actually a whole word entered into the trie and not just a prefix.
Cheers!
The end of the key is usually indicated via a leaf node. Either:
the child nodes are empty; or
you have a branch, with one prefix of the key, and some children nodes.
Your design doesn't have a leaf/empty node. Try indicating it with e.g. a null.
If you need to store both "App" and "Apple", but not "Appl", then yes, you need something like an endOfWord flag.
Alternatively, you could fit it into your design by (sometimes) having two nodes with the same character. So "Ap" has two child nodes: the leaf node "p" and an internal node "p" with a child "l".
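A minimal Python sketch of the endOfWord flag approach, using the words from the question. The class and method names are illustrative, and a dictionary stands in for the question's `collection<node>`:

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # character -> TrieNode
        self.end_of_word = False  # True only if an inserted word ends here

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.end_of_word = True

    def _walk(self, text):
        """Follow text from the root; return the final node or None."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains_prefix(self, prefix):
        return self._walk(prefix) is not None

    def contains_whole_word(self, word):
        node = self._walk(word)
        return node is not None and node.end_of_word

trie = Trie()
for word in ["Apple", "Ark", "Cat"]:
    trie.insert(word)
```

With the flag, "App" is reachable as a prefix but is not reported as a whole word unless it is inserted explicitly.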

Trie implementation - Inserting elements into a trie

I am developing a Trie data-structure where each node represents a word. So words st, stack, stackoverflow and overflow will be arranged as
root
--st
---stack
-----stackoverflow
--overflow
My Trie uses a HashTable internally, so every node lookup takes constant time. Following is the algorithm I came up with to insert an item into the trie.
1. Check whether the item already exists in the trie. If it exists, return; otherwise go to step 2.
2. Iterate over each character in the key and check for the existence of the word. Do this until we reach a node under which the new value can be added as a child. If no such node is found, the value is added under the root node.
3. After insertion, rearrange the siblings of the node under which the new node was inserted. This walks through all the siblings and compares each against the newly inserted node. If any sibling starts with the same characters as the new node, it is moved and added as a child of the new node.
I am not sure that this is the correct way of implementing a trie. Any suggestions or improvements are welcome.
Language used : C++
The trie should look like this
             ROOT
    overflow/    \st
           O      O
                   \ack
                    O
                     \overflow
                      O
Normally you don't need to use hash tables as part of a trie; the trie itself is already an efficient index data structure. Of course you can do that.
But anyway, your step (2) should actually descend the trie during the search and not just query the hash function. In this way you find the insertion point readily and don't need to search for it later as a separate step.
I believe step (3) is wrong, you don't need to rearrange a trie and as a matter of fact you shouldn't be able to because it's only the additional string fragments that you store in the trie; see the picture above.
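A Python sketch of a trie that stores string fragments on its edges, as in the picture above (often called a compressed trie or radix tree). The names and the edge representation are assumptions made for this example. Insertion descends from the root and, where a new word shares only part of an existing fragment, splits that edge on the way down, so no separate rearranging pass over siblings is needed:

```python
import os

class RadixNode:
    def __init__(self):
        self.edges = {}       # first character -> [fragment, child node]
        self.is_word = False  # True if an inserted word ends at this node

def insert(root, word):
    node = root
    while word:
        edge = node.edges.get(word[0])
        if edge is None:
            # Nothing shares a prefix here: hang the rest of the word off this node.
            leaf = RadixNode()
            leaf.is_word = True
            node.edges[word[0]] = [word, leaf]
            return
        fragment, child = edge
        common = len(os.path.commonprefix([fragment, word]))
        if common < len(fragment):
            # Only part of the fragment matches: split the edge at that point.
            middle = RadixNode()
            middle.edges[fragment[common]] = [fragment[common:], child]
            node.edges[word[0]] = [fragment[:common], middle]
            child = middle
        node = child
        word = word[common:]
    node.is_word = True

def contains(root, word):
    node = root
    while word:
        edge = node.edges.get(word[0])
        if edge is None or not word.startswith(edge[0]):
            return False
        node = edge[1]
        word = word[len(edge[0]):]
    return node.is_word

root = RadixNode()
for w in ["st", "stack", "stackoverflow", "overflow"]:
    insert(root, w)
root_fragments = sorted(fragment for fragment, _ in root.edges.values())
```

For the four words in the question this produces the fragments st, ack and overflow along one branch and overflow along the other, matching the diagram.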
Following is the Java code for the insert algorithm.
public void insert(String s){
    Node current = root;
    if(s.length()==0) //For an empty string
        current.marker=true;
    for(int i=0;i<s.length();i++){
        Node child = current.subNode(s.charAt(i));
        if(child!=null){
            current = child;
        }
        else{
            current.child.add(new Node(s.charAt(i)));
            current = current.subNode(s.charAt(i));
        }
        // Set marker to indicate end of the word
        if(i==s.length()-1)
            current.marker = true;
    }
}
For a more detailed tutorial, refer here.
