Can an enumeration be a class? - C++ [enum class] [Theory] - c++11

My End Goal:
Create the implementation of a hash-table from scratch. The twist, if the number of entries in a hash bucket is greater than 10 it is stored in Binary Search Tree, or else it is stored in a Linked List.
In my knowledge the only way to be able to achieve this is through a
enum class type_name { a, b };
My Question: Can 'a', and 'b' be classes?
Thought Process:
So to implement a hash table, I am thinking to make an array of the enumerated class this way, as soon the Linked List at any index of the array it will be replaced with a Binary Search Tree.
If this is not possible, what would be the best way to achieve this? My implementation for Linked List and Binary Search Tree are complete and work perfectly.
Note: I am not looking for a complete implemenation/ full code. I would like to be able to code it myself but I think my theory is flawed.
Visualization of My Idea
----------------------------------H A S H T A B L E---------------------------------------
enum class Hash { LinkedList, Tree };
INDEXES: 0 1 2 3 4
Hash eg = new Hash [ LinkedList, LinkedList, LinkedList, LinkedList, LinkedList ]
//11th element is inserted into eg[2]
//Method to Replace Linked List with Binary Search Tree
if (eg[1].getSize() > 10) {
Tree toReplace();
Node *follow = eg[1].headptr; //Each linked list is made of connected
//headptr is a pointer to the first element of the linked list
while ( follow != nullptr ){
toReplace.insert(follow->value);
follow = follow.next() //Next is the pointer to the next element in the linked list
}
}
//Now, the Linked List at eg[2] is replaced with a Binary Search Tree
Hash eg = new Hash [ LinkedList, LinkedList, Tree, LinkedList, LinkedList ]

Short answer: No.
An enumeration is a distinct type whose value is restricted to a range
of values (see below for details), which may include several
explicitly named constants ("enumerators"). The values of the
constants are values of an integral type known as the underlying type
of the enumeration.
http://en.cppreference.com/w/cpp/language/enum
Classes will not be 'values of an integral type'.
You may be able to achieve what you want with a tuple.
http://en.cppreference.com/w/cpp/utility/tuple

Related

Hashtable underlying place holder?

I am trying to understand the HashTable data structure. I understand that in HashTable we first use HashFunction to coverts a key to hash Code and then using modulo operator to convert Hash code to integer index and which is used to get the location in HashTable where data is placed.
At a high level, the flow is like this?
Key -> Hash Function -> Hash code -> Modulo operator -> integer index -> Store in HashTable
Since the key is stored based on the index as emitted by the modulo operator, my doubt is, what is the underlying data structure which is used to hold the actual data? Is it an array, for array can be accessed using Index.
Can anyone help me understand this?
Though it completely depends on implementation, I would agree that underlying data structure would be array with linked list, since array is convinient to access elements at low cost, while linked list is necessary to handle hash collisions.
Here is example of details how it is implemented in java openjdk Hashtable
Initially it creates array with initial capacity:
table = new Entry<?,?>[initialCapacity];
It checks for capacity threshold everytime when new element is added. When threshold limit is reached it performs rehashing and creates a new array which is double size of old array
int newCapacity = (oldCapacity << 1) + 1;
if (newCapacity - MAX_ARRAY_SIZE > 0) {
if (oldCapacity == MAX_ARRAY_SIZE)
// Keep running with MAX_ARRAY_SIZE buckets
return;
newCapacity = MAX_ARRAY_SIZE;
}
Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];
modCount++;
threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
table = newMap;
Hashtable Entry forms a linked list. It is used in case of hash collisions, since index for 2 different values would become same and required value is checked through linked list.
private static class Entry<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Entry<K,V> next;
You may want to check other more simple implementations of Hashtables for better understanding.

Hash Tables and Separate Chaining: How do you know which value to return from the bucket's list?

We're learning about hash tables in my data structures and algorithms class, and I'm having trouble understanding separate chaining.
I know the basic premise: each bucket has a pointer to a Node that contains a key-value pair, and each Node contains a pointer to the next (potential) Node in the current bucket's mini linked list. This is mainly used to handle collisions.
Now, suppose for simplicity that the hash table has 5 buckets. Suppose I wrote the following lines of code in my main after creating an appropriate hash table instance.
myHashTable["rick"] = "Rick Sanchez";
myHashTable["morty"] = "Morty Smith";
Let's imagine whatever hashing function we're using just so happens to produce the same bucket index for both string keys rick and morty. Let's say that bucket index is index 0, for simplicity.
So at index 0 in our hash table, we have two nodes with values of Rick Sanchez and Morty Smith, in whatever order we decide to put them in (the first pointing to the second).
When I want to display the corresponding value for rick, which is Rick Sanchez per our code here, the hashing function will produce the bucket index of 0.
How do I decide which node needs to be returned? Do I loop through the nodes until I find the one whose key matches rick?
To resolve Hash Tables conflicts, that's it, to put or get an item into the Hash Table whose hash value collides with another one, you will end up reducing a map to the data structure that is backing the hash table implementation; this is generally a linked list. In the case of a collision this is the worst case for the Hash Table structure and you will end up with an O(n) operation to get to the correct item in the linked list. That's it, a loop as you said, that will search the item with the matching key. But, in the cases that you have a data structure like a balanced tree to search, it can be O(logN) time, as the Java8 implementation.
As JEP 180: Handle Frequent HashMap Collisions with Balanced Trees says:
The principal idea is that once the number of items in a hash bucket
grows beyond a certain threshold, that bucket will switch from using a
linked list of entries to a balanced tree. In the case of high hash
collisions, this will improve worst-case performance from O(n) to
O(log n).
This technique has already been implemented in the latest version of
the java.util.concurrent.ConcurrentHashMap class, which is also slated
for inclusion in JDK 8 as part of JEP 155. Portions of that code will
be re-used to implement the same idea in the HashMap and LinkedHashMap
classes.
I strongly suggest to always look at some existing implementation. To say about one, you could look at the Java 7 implementation. That will increase your code reading skills, that is almost more important or you do more often than writing code. I know that it is more effort but it will pay off.
For example, take a look at the HashTable.get method from Java 7:
public synchronized V get(Object key) {
Entry<?,?> tab[] = table;
int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;
for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
if ((e.hash == hash) && e.key.equals(key)) {
return (V)e.value;
}
}
return null;
}
Here we see that if ((e.hash == hash) && e.key.equals(key)) is trying to find the correct item with the matching key.
And here is the full source code: HashTable.java

How does hashtable read correct values in case of collision?

I have some hashtable. For instance I have two entities like
john = { 1stname: jonh, 2ndname: johnson },
eric = { 1stname: eric, 2ndname: ericson }
Then I put them in hashtable:
ht["john"] = john;
ht["eric"] = eric;
Let's imagine there is a collision and hashtable use chaining to fix it. As a result there should be a linked list with these two entities like this
How does hashtable understand what entity should be returned for key? Hash values are the same and it knows nothing about entities structure. For instance if I write thisvar val = ht["john"]; how does hashtable (having only key value and its hash) find out that value should be john record and not eric.
I think what you are confused about is what is stored at each location in the hashtable's adjacent list. It seems like you assume that only the value is being stored. In fact, the data in each list node is a tuple (key, value).
Once you ask for ht['john'], the hashtable find the list associated with hash('john') and if the list is not empty it searches for the key 'john' in the list. If the key is found as the first element of the tuple then the value (second element of the tuple) is returned. If the key is not found, then it means that the element is not in the hashtable.
To summarize, the key hash is used to quickly identify the cell in which the element should be stored if present. Actual key equality is tested for to decide whether the key exists or not.
Is this what you are asking for? I have already put this in comments but seems to me you did not follow link
Collision Resolution in the Hashtable Class
Recall that when inserting an item into or retrieving an item from a hash table, a collision can occur. When inserting an item, an open slot must be found. When retrieving an item, the actual item must be found if it is not in the expected location. Earlier we briefly examined two collusion resolution strategies:
Linear probing
Quardratic probing
The Hashtable class uses a different technique referred to as rehasing. (Some sources refer to rehashing as double hashing.)
Rehasing works as follows: there is a set of hash different functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed. The previous section showed only one hash function, which is the initial hash function (H1). The other hash functions are very similar to this function, only differentiating by a multiplicative factor. In general, the hash function Hk is defined as:
Hk(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize – 1)))] % hashsize
Mathematical Note With rehasing it is important that each slot in the hash table is visited exactly once when hashsize number of probes are made. That is, for a given key you don't want Hi and Hj to hash to the same slot in the hash table. With the rehashing formula used by the Hashtable class, this property is maintained if the result of (1 + (((GetHash(key) >> 5) + 1) % (hashsize – 1))and hashsize are relatively prime. (Two numbers are relatively prime if they share no common factors.) These two numbers are guaranteed to be relatively prime if hashsize is a prime number.
Rehasing provides better collision avoidance than either linear or quadratic probing.
sources here

Pseudo code to check if binary tree is a binary search tree - not sure about the recursion

I have homeowork to write pseudo code to check if a valid binary tree is a search binary tree.
I created an array to hold the in-order values of the tree. if the in-order values are in decreasing order it means it is indeed BST. However I've got some problem with the recursion in the method InOverArr.
I need to update the index of the array in order to submit the values to the array in the order they are at the tree.
I'm not sure the index is really updated properly during the recursion.. is it or not? and if you see some problem can you help me fix this? thanks a lot
pseudo code
first function
IsBST(node)
size ← TreeSize(node)
create new array TreeArr of Size number of cells
index ← 0
few comments:
now we use the IN_ORDER procedure with a small variation , I called the new version of the procedure: InOrderArr
the pseudo code of InOrderArr is described below IsBST
InOrderArr(node, TreeArr, index)
for i from 1 to size-1 do
if not (TreeArr[i] > TreeArr[i-1]) return
false
return true
second function
InOrderArr (node, Array, index)
if node = NULL then return
else
InOrderArr (node.left, Array, index)
treeArr[index] = node.key
index ← index + 1
InOrderArr (node.right, Array, index)
Return
Your code is generally correct. Just three notes.
The correctness of the code depends on the implementation, specifically on the way of index handling. Many programming languages pass arguments to subroutines by value. That means the subroutine receives a copy of the value and modifications made to the parameter have no effect on the original value. So incrementing index during execution of InOrderArr (node.left, Array, index) would not affect the position used by treeArr[index] = node.key. As a result only the rightmost path would be stored in the array.
To avoid that you'll have to ensure that index is passed by reference, so that incrementation done by a callee advances the position used later by a caller.
BST is usually defined so that the left subtreee of a node contains keys that are less than that node's key, and the right subtree contains nodes with greater keys – see Wikipedia's article on BST. Then the inorder traversal retrieves keys in ascending order. Why do you expect descending order?
Possibly it would be more efficient to drop the array and just recursively test a definition condition of BST?
Whenever we follow a left link we expect keys which are less than the current one. Whenever we follow the right link we expect keys greater the the current one. So for most subtrees there is some interval of keys values, defined by some ancestor nodes' keys. Just track those keys and test whether the key falls inside the current valid interval. Be sure to handle 'no left end defined' condition on the letfmost path and 'no right end' on the rightmost path of the tree. At the root node there's no ancestor yet, so the root key is not tested at all (any value is OK).
EDIT
C code draft:
// Test a node against its closest left-side and right-side ancestors
boolean isNodeBST(NODE *lt, NODE *node, NODE *rt)
{
if(node == NULL)
return true;
if(lt != NULL && node->key < lt->key)
return false;
if(rt != NULL && node->key > rt->key)
return false;
return
isNodeBST(lt, node->left, node) &&
isNodeBST(node, node->right, rt);
}
boolean isTreeBST(TREE *tree)
{
return isNodeBST( NULL, tree->root, NULL);
}

Trie implementation - Inserting elements into a trie

I am developing a Trie data-structure where each node represents a word. So words st, stack, stackoverflow and overflow will be arranged as
root
--st
---stack
-----stackoverflow
--overflow
My Trie uses a HashTable internally so all node lookup will take constant time. Following is the algorithm I came up to insert an item into the trie.
Check item existence in the trie. If exist, return, else goto step2.
Iterate each character in the key and check for the existence of the word. Do this until we get a node where the new value can be added as child. If no node found, it will be added under root node.
After insertion, rearrange the siblings of the node under which the new node was inserted. This will walk through all the siblings and compare against the newly inserted node. If any of the node starts with same characters that new node have, it will be moved from there and added as child of new node.
I am not sure that this is the correct way of implementing a trie. Any suggestions or improvements are welcome.
Language used : C++
The trie should look like this
ROOT
overflow/ \st
O O
\ack
O
\overflow
O
Normally you don't need to use hash tables as part of a trie; the trie itself is already an efficient index data structure. Of course you can do that.
But anyway, your step (2) should actually descend the trie during the search and not just query the hash function. In this way you find the insertion point readily and don't need to search for it later as a separate step.
I believe step (3) is wrong, you don't need to rearrange a trie and as a matter of fact you shouldn't be able to because it's only the additional string fragments that you store in the trie; see the picture above.
Following is the java code for insert algorithm.
public void insert(String s){
Node current = root;
if(s.length()==0) //For an empty character
current.marker=true;
for(int i=0;i<s.length();i++){
Node child = current.subNode(s.charAt(i));
if(child!=null){
current = child;
}
else{
current.child.add(new Node(s.charAt(i)));
current = current.subNode(s.charAt(i));
}
// Set marker to indicate end of the word
if(i==s.length()-1)
current.marker = true;
}
}
For a more detailed tutorial, refer here.

Resources