Removing from Doubly LinkedList in O(1) - data-structures

I know that the time complexity of removing an element from a doubly linked list is O(1), and that seems obvious and clear. But what happens if we don't receive the element to remove, only the value that the element wraps?
For example, if we define a class Element to wrap a String (to provide pointers to the next and previous elements), we can do the removal in O(1) only if the method receives that element as input, not the string.
If the remove method receives the string, it has to search through the list to find the corresponding element, right? So the time complexity in this case would be O(n), not O(1) anymore.
class Element {
    String content;
    Element next;
    Element previous;
}

class LinkedList {
    ...
    public void remove(String s) {
        // it has to first find the element corresponding to this String!
    }
}

You are exactly right.
Remove is considered O(1) (when you do remove(Element)), but usually this goes together with a find operation (i.e. remove(String) or remove(find(String))), which is O(n).
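To make the distinction concrete, here is a minimal sketch building on the Element class from the question (the head/tail bookkeeping is assumed, not taken from any particular library): given the node itself, removal is just a couple of pointer updates, while remove(String) has to walk the list first.

class LinkedList {
    Element head;
    Element tail;

    // O(1): sketch only, assumes e really is a node of this list,
    // so we only need to rewire its neighbours.
    public void remove(Element e) {
        if (e.previous != null) e.previous.next = e.next; else head = e.next;
        if (e.next != null) e.next.previous = e.previous; else tail = e.previous;
        e.next = e.previous = null;
    }

    // O(n): we must first find the element that wraps the string.
    public void remove(String s) {
        Element current = head;
        while (current != null && !current.content.equals(s)) {
            current = current.next;            // the linear search dominates
        }
        if (current != null) remove(current);  // the unlink itself is O(1)
    }
}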

This linked list (Circular, doubly linked list, how to manage the node to object relationship?) uses a very suspect method which inserts properties into the objects being added to the list. It makes removal O(1) without searching for the object, but at the risk of property name clashes and a performance penalty if you're adding primitive types to the list (they get boxed when the properties are added).

Related

Merge sort with insertion function

I have two sorted linked lists, l1 and l2. I want to merge l2 into l1 while keeping l1 sorted, so that the result is a sorted l1 and an empty l2. How do I approach this problem using the insert_before, insert_after, and remove_node functions? Any ideas?
If you know how linked lists are implemented, you can easily just iterate through both of them, using something like the approach below. If you don't know too much about linked lists you might want to read up on those first.
if(l1 == null || l2 == null){
    return l1 == null ? l2 : l1;
}
indexA = indexB = 0;
elemA = l1.get(0);
elemB = l2.get(0);
while(indexA < l1.size() || indexB < l2.size()){
    if(indexA == l1.size()){
        // l1 is exhausted: append the remaining elements of l2 at the end
        l1.insert_after(l1.size()-1, elemB);
        indexA++;
        indexB++;
        if(indexB < l2.size()){
            elemB = l2.get(indexB);
        }
        continue;
    }
    if(indexB == l2.size()){
        // l2 is exhausted: nothing left to merge
        break;
    }
    if(elemA < elemB){
        // the current element of l1 is smaller: just advance in l1
        indexA++;
        if(indexA < l1.size()){
            elemA = l1.get(indexA);
        }
        continue;
    }
    // elemB <= elemA: insert the element of l2 before the current element of l1.
    // (Falling through here also covers elemA == elemB, which would otherwise
    // loop forever, since neither index would advance.)
    l1.insert_before(indexA, elemB);
    indexA++;
    indexB++;
    if(indexB < l2.size()){
        elemB = l2.get(indexB);
    }
}
return l1;
This is a somewhat Java-like implementation; it iterates through both lists and merges them together as it goes. I decided to only return l1, as the empty list would be rather useless. All the inefficient get() calls could be avoided by writing your own simple LinkedList class with access to the next and previous fields, which would greatly facilitate navigation.
If you could provide additional information about the methods you have available, or the language this has to be written in, I could probably give you a better explanation.
Edit: Explanation
First, elemA and elemB are given the values of the first elements of the two linked lists. Then the while loop iterates through both lists until the index values used to keep track of where we are are both too high (i.e. equal to the size of the respective list).
The first if statement checks whether the first list has already been iterated through completely. If so, the next element of the second list is simply appended after the last element of l1, as it must be greater than that element (otherwise indexA would not have been increased). In this if statement indexA is also incremented, because the size of the list is growing and indexA is compared to l1.size(). Finally, the continue statement skips to the next iteration of the loop.
The second if statement in the while loop tests whether the second list has already been iterated through completely. If so, there is nothing more to merge, so the loop is stopped by the break statement.
Below that is more or less classic merge logic:
If the current element of l1 is smaller than the current element of l2, just advance to the next element of l1 and go to the next iteration of the loop. Otherwise, insert the current element of l2 before elemA and continue. Here indexA is also incremented, because otherwise it would no longer point to the right element after something was inserted before it.
Is this sufficient?
Edit: Added a test for whether either of the lists is null, as suggested by #itwasntme
The functions insert_before, insert_after, and remove_node are not clearly defined in the question. So assuming one of the parameters is a reference to the "list", is the other parameter an index, an iterator, or a pointer to a node? If the lists are doubly linked lists, then it makes sense for the second parameter to be an iterator or a pointer to a node (each node has a pointer to previous node, so there's no need to scan according to an index to find the prior node).
In the case of Visual Studio's std::list::sort, it uses iterators and std::list::splice to move nodes within a list, using iterators to track the beginning and ending of sub-lists during a merge sort. The version of std::list::splice used by std::list::sort combines remove_node and insert_before into a single function.
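To make that concrete, here is a minimal, hypothetical sketch of a pointer-based merge in that spirit (the Node class and method below are illustrative only, since the question does not define the insert_before/insert_after/remove_node signatures): moving the smaller node onto the tail of the merged chain plays the role of a remove_node plus insert_before collapsed into a single splice.

class Node {
    int value;
    Node prev, next;
    Node(int value) { this.value = value; }
}

class MergeSketch {
    // Merges the sorted chain starting at head2 into the sorted chain
    // starting at head1 and returns the head of the merged chain.
    static Node merge(Node head1, Node head2) {
        Node dummy = new Node(0);   // sentinel node simplifies head handling
        Node tail = dummy;
        while (head1 != null && head2 != null) {
            Node smaller;
            if (head1.value <= head2.value) {
                smaller = head1;
                head1 = head1.next;
            } else {
                smaller = head2;
                head2 = head2.next;
            }
            // splice the smaller node onto the end of the merged chain
            smaller.prev = tail;
            tail.next = smaller;
            tail = smaller;
        }
        Node rest = (head1 != null) ? head1 : head2;
        tail.next = rest;
        if (rest != null) rest.prev = tail;
        Node head = dummy.next;
        if (head != null) head.prev = null;
        return head;
    }
}

Because it only follows prev/next pointers and never searches by index, the whole merge is O(n + m) for lists of length n and m.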

Hash Tables and Separate Chaining: How do you know which value to return from the bucket's list?

We're learning about hash tables in my data structures and algorithms class, and I'm having trouble understanding separate chaining.
I know the basic premise: each bucket has a pointer to a Node that contains a key-value pair, and each Node contains a pointer to the next (potential) Node in the current bucket's mini linked list. This is mainly used to handle collisions.
Now, suppose for simplicity that the hash table has 5 buckets. Suppose I wrote the following lines of code in my main after creating an appropriate hash table instance.
myHashTable["rick"] = "Rick Sanchez";
myHashTable["morty"] = "Morty Smith";
Let's imagine whatever hashing function we're using just so happens to produce the same bucket index for both string keys rick and morty. Let's say that bucket index is index 0, for simplicity.
So at index 0 in our hash table, we have two nodes with values of Rick Sanchez and Morty Smith, in whatever order we decide to put them in (the first pointing to the second).
When I want to display the corresponding value for rick, which is Rick Sanchez per our code here, the hashing function will produce the bucket index of 0.
How do I decide which node needs to be returned? Do I loop through the nodes until I find the one whose key matches rick?
To resolve hash table collisions, that is, to put or get an item whose hash value collides with another one's, you end up falling back on the data structure that backs each bucket of the hash table; this is generally a linked list. A collision is the worst case for the hash table and leaves you with an O(n) operation to get to the correct item in that linked list. So yes, it is a loop, as you said, that searches for the item with the matching key. But in cases where the bucket is backed by a data structure you can search, such as a balanced tree, it can be O(log n), as in the Java 8 implementation.
As JEP 180: Handle Frequent HashMap Collisions with Balanced Trees says:
The principal idea is that once the number of items in a hash bucket
grows beyond a certain threshold, that bucket will switch from using a
linked list of entries to a balanced tree. In the case of high hash
collisions, this will improve worst-case performance from O(n) to
O(log n).
This technique has already been implemented in the latest version of
the java.util.concurrent.ConcurrentHashMap class, which is also slated
for inclusion in JDK 8 as part of JEP 155. Portions of that code will
be re-used to implement the same idea in the HashMap and LinkedHashMap
classes.
I strongly suggest always looking at an existing implementation. To name one, you could look at the Java 7 implementation. That will improve your code-reading skills, which is arguably more important, and something you do more often, than writing code. I know it is more effort, but it will pay off.
For example, take a look at the HashTable.get method from Java 7:
public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index]; e != null; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}
Here we see that if ((e.hash == hash) && e.key.equals(key)) is trying to find the correct item with the matching key.
And here is the full source code: HashTable.java
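For completeness, here is a minimal, hypothetical separate-chaining sketch (not the JDK code, just the same idea in a self-contained form): each bucket holds the head of a linked list of key/value nodes, put walks the chain to see whether the key is already there, and get walks it to find the matching key.

class ChainedMap<K, V> {
    private static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = (Node<K, V>[]) new Node[5];

    private int indexFor(K key) {
        return (key.hashCode() & 0x7FFFFFFF) % buckets.length;
    }

    public void put(K key, V value) {
        int index = indexFor(key);
        // Walk the bucket's chain: if the key already exists, overwrite its value.
        for (Node<K, V> e = buckets[index]; e != null; e = e.next) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        // Otherwise prepend a new node to the bucket's list; a collision simply
        // means the list ends up with more than one node.
        buckets[index] = new Node<>(key, value, buckets[index]);
    }

    public V get(K key) {
        int index = indexFor(key);
        for (Node<K, V> e = buckets[index]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value;   // the matching-key loop
        }
        return null;
    }
}

With only five buckets it is quite possible that "rick" and "morty" land in the same bucket; get("rick") then walks that one bucket's chain until e.key.equals("rick") succeeds, exactly like the loop in the Java 7 code above.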

Can an enumeration be a class? - C++ [enum class] [Theory]

My End Goal:
Create the implementation of a hash table from scratch. The twist: if the number of entries in a hash bucket is greater than 10, the bucket is stored as a Binary Search Tree; otherwise it is stored as a Linked List.
To my knowledge, the only way to achieve this is through an
enum class type_name { a, b };
My Question: Can 'a' and 'b' be classes?
Thought Process:
So, to implement a hash table, I am thinking of making an array of this enumerated class, so that, as soon as the linked list at any index of the array gets too big, it is replaced with a Binary Search Tree.
If this is not possible, what would be the best way to achieve this? My implementations of the Linked List and the Binary Search Tree are complete and work perfectly.
Note: I am not looking for a complete implementation / full code. I would like to be able to code it myself, but I think my theory is flawed.
Visualization of My Idea
----------------------------------H A S H T A B L E---------------------------------------
enum class Hash { LinkedList, Tree };
INDEXES: 0 1 2 3 4
Hash eg = new Hash [ LinkedList, LinkedList, LinkedList, LinkedList, LinkedList ]
//11th element is inserted into eg[2]
//Method to Replace Linked List with Binary Search Tree
if (eg[2].getSize() > 10) {
    Tree toReplace;
    Node *follow = eg[2].headptr;  // each linked list is made of connected Nodes;
                                   // headptr is a pointer to the first element of the linked list
    while (follow != nullptr) {
        toReplace.insert(follow->value);
        follow = follow->next;     // next is the pointer to the next element in the linked list
    }
}
//Now, the Linked List at eg[2] is replaced with a Binary Search Tree
Hash eg = new Hash [ LinkedList, LinkedList, Tree, LinkedList, LinkedList ]
Short answer: No.
An enumeration is a distinct type whose value is restricted to a range
of values (see below for details), which may include several
explicitly named constants ("enumerators"). The values of the
constants are values of an integral type known as the underlying type
of the enumeration.
http://en.cppreference.com/w/cpp/language/enum
Classes will not be 'values of an integral type'.
You may be able to achieve what you want with a tuple.
http://en.cppreference.com/w/cpp/utility/tuple

optimizing the data structure implementation

There is a stream of random characters coming like 'a''b''c''a'... and so on. At any given point in time when I query I need to get the first non repeating character. For example, for the input "abca", 'b' should be returned since a is repeated and the first non repeating character is 'b'.
There need to be two methods, one for inserting and one for querying.
My solution is to have a linked list that stores the incoming stream characters. When I get the next character, I compare it against all the current characters; if it is already present I do not insert it at the end of the linked list, otherwise I insert it at the end. With this approach, a query takes O(1) since I just return the first element of the linked list, and an insert takes O(n) since in the worst case I need to compare against every element from the first to the last.
Is there any better-performing way?
Either you haven't explained your algorithm well or it won't return the correct result. In the example a b a, would your algorithm return a (because it is the first element in the linked list)?
Anyway, here is a modification that improves performance. The idea is to use a hash map from characters to (doubly) linked list nodes. This map can be used to determine if a character has already been inserted and to get to the required node quickly. We should allow a null value for the map target (instead of the list node) to express a character that has occurred more than once already.
The insertion method works as follows:
Check if the map contains the current character (O(1)). If not, add it to the end of the list and add a reference to the map (O(1)).
If the character is already in the map: Check if the pointed to node is null (O(1)). If so, just ignore it. If it is not, remove the pointed to node from the list and update the reference to a null value (O(1)).
Overall, a O(1) operation.
The query works as in your previous solution.
Here is a C# implementation. It's basically a 1:1 translation of the above explanation:
class StreamAnalyzer
{
    LinkedList<char> characterList = new LinkedList<char>();
    Dictionary<char, LinkedListNode<char>> characterMap
        = new Dictionary<char, LinkedListNode<char>>();

    public void AddCharacter(char c)
    {
        LinkedListNode<char> referencedNode;
        if (characterMap.TryGetValue(c, out referencedNode))
        {
            if (referencedNode != null)
            {
                characterList.Remove(referencedNode);
                characterMap[c] = null;
            }
        }
        else
        {
            var node = new LinkedListNode<char>(c);
            characterList.AddLast(node);
            characterMap.Add(c, node);
        }
    }

    public char? GetFirstNonRepeatingCharacter()
    {
        if (characterList.First == null)
            return null;
        else
            return characterList.First.Value;
    }
}

Trie implementation - Inserting elements into a trie

I am developing a Trie data-structure where each node represents a word. So words st, stack, stackoverflow and overflow will be arranged as
root
--st
---stack
-----stackoverflow
--overflow
My Trie uses a HashTable internally, so all node lookups take constant time. Following is the algorithm I came up with to insert an item into the trie.
1. Check whether the item exists in the trie. If it exists, return; otherwise go to step 2.
2. Iterate over each character in the key and check for the existence of the word. Do this until we reach a node under which the new value can be added as a child. If no such node is found, it is added under the root node.
3. After insertion, rearrange the siblings of the node under which the new node was inserted. This walks through all the siblings and compares them against the newly inserted node. If any node starts with the same characters as the new node, it is moved from there and added as a child of the new node.
I am not sure that this is the correct way of implementing a trie. Any suggestions or improvements are welcome.
Language used : C++
The trie should look like this:

         ROOT
overflow/    \st
       O      O
               \ack
                O
                 \overflow
                  O
Normally you don't need to use hash tables as part of a trie; the trie itself is already an efficient index data structure. Of course, you can still do that.
But anyway, your step (2) should actually descend the trie during the search and not just query the hash function. In this way you find the insertion point readily and don't need to search for it later as a separate step.
I believe step (3) is wrong: you don't need to rearrange a trie, and as a matter of fact you shouldn't be able to, because it's only the additional string fragments that you store in the trie; see the picture above.
Following is the Java code for the insert algorithm.
public void insert(String s){
    Node current = root;
    if (s.length() == 0)    // for an empty character
        current.marker = true;
    for (int i = 0; i < s.length(); i++){
        Node child = current.subNode(s.charAt(i));
        if (child != null){
            current = child;
        }
        else {
            current.child.add(new Node(s.charAt(i)));
            current = current.subNode(s.charAt(i));
        }
        // Set marker to indicate end of the word
        if (i == s.length() - 1)
            current.marker = true;
    }
}
For a more detailed tutorial, refer here.
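The insert code above relies on a Node class with content, marker, child, and subNode, which is not shown. Here is a minimal sketch of what such a class might look like (the field and method names mirror the ones used in the insert method; everything else is an assumption):

class Node {
    char content;                   // the character stored at this node
    boolean marker;                 // true if a word ends at this node
    java.util.List<Node> child = new java.util.ArrayList<>();

    Node(char c) {
        this.content = c;
    }

    // Returns the child holding character c, or null if there is none.
    Node subNode(char c) {
        for (Node n : child) {
            if (n.content == c) {
                return n;
            }
        }
        return null;
    }
}

The trie itself would then hold something like Node root = new Node((char) 0), which is the starting point used by the insert method.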

Resources