What is the need for asymmetric linked list - data-structures

I am studying data structures. I have come across the asymmetric linked list, which is described as a special type of doubly linked list in which
1. next link points to next node address
2. prev link points to current node address itself
But I wonder,
1. What advantages do we get by designing a linked list this way?
2. What kind of applications would it be suitable for?
Could anyone kindly explain more about the asymmetric linked list? I googled but could not find relevant answers. Thank you.
Source: http://en.wikipedia.org/wiki/Doubly_linked_list#Asymmetric_doubly-linked_list

I agree the Wiki page is misleading. Here is the difference between LL and ALL:
Open Linked List:
node.next = nextNode
node.prev = prevNode
Asymmetric Linked List:
node.next = nextNode
node.prev = &prevNode.next
Note the difference: prevNode vs. the address of prevNode.next.
Pointing at a pointer inside the previous node still lets you traverse the list backwards (you can recover prevNode's address by subtracting the offset of the next field from that pointer), and it can simplify insertion and deletion, especially at the first element.

Given a pointer to any node of a doubly linked list, we can reach every node by following 'prev' and 'next', while a singly linked list cannot do that unless the pointer refers to the first node.
E.g., deleting a node from a linked list. With a singly linked list, you have to traverse the list from the head to find the node while also keeping track of its predecessor, which makes the time complexity O(n). With a doubly linked list, however, you can delete a given node in constant time.
In short, given a specific node, if we need its predecessor, a singly linked list forces an O(n) traversal from the head, while a doubly linked list does not.
By the way, list in STL and LinkedList in Java are implemented with double linked list.

Because a picture is worth a thousand words:
As you can see, the "previous" field references the "next" field of the previous element, rather than the previous element itself. This makes little difference between nodes, except for the first element: its previous field can point to the head pointer rather than pointing to the last element (as in a circular list) or being null.
The main advantage is for insertion and deletion: you don't need to take care of the head or check whether the element is the first one. A single pointer to an element is enough to perform an insert or a delete on the list.
One disadvantage vs. a circular list: the only way to get the last element (e.g. to implement some "add last" operation) is to loop through the whole list.
You also lose the ability to loop through the list in reverse (because there is no direct pointer to the previous node), unless all elements have the same layout and you are allowed to do pointer arithmetic (as in C/C++).
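A minimal C sketch of the layout these answers describe. The names `struct anode`, `pprev`, `alist_insert_head`, `alist_insert_after` and `alist_remove` are hypothetical, chosen for illustration only. The back link stores the address of whichever pointer currently points at the node, so removal needs neither the head nor a first-element check:

```c
#include <stddef.h>

/* Hypothetical node type: pprev holds the address of the pointer that
   points at this node (the head pointer, or the previous node's next). */
struct anode {
    struct anode *next;
    struct anode **pprev;
    int value;
};

/* Insert n at the front; *head may be NULL for an empty list. */
void alist_insert_head(struct anode **head, struct anode *n)
{
    n->next = *head;
    n->pprev = head;
    if (*head)
        (*head)->pprev = &n->next;
    *head = n;
}

/* Insert n directly after pos. */
void alist_insert_after(struct anode *pos, struct anode *n)
{
    n->next = pos->next;
    n->pprev = &pos->next;
    if (pos->next)
        pos->next->pprev = &n->next;
    pos->next = n;
}

/* Unlink n given only n: no head argument, no "is it first?" test. */
void alist_remove(struct anode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}
```

Removing the first node updates the head pointer through `*n->pprev` exactly as removing any other node updates its predecessor's `next`. The Linux kernel's hlist and the BSD `<sys/queue.h>` LIST macros use this same layout, which suggests hash-bucket chains as one typical application.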

Related

what does x.next.prev mean in linked lists (data structures)?

I understand that x.next is the next element to x, and x.prev is the previous element to x. But what does x.next.prev or x.prev.next mean?
I hope there are knowledgeable people here who have stackoverflow of knowledge :-)).
Thank you.
These forms make sense e.g. in adding and removing nodes of doubly linked lists, to save some wording.
Assume you have a doubly linked list like this:
Assume you are writing code that removes the node B from it. For that, you first make the "back" pointer of C point to A instead of B. It would be
C.prev = A
To achieve this:
If you only had a handle of B in your code, you would first find those A and C and then do the same:
C = B.next
A = B.prev
C.prev = A
Or, without the temporary variables:
B.next.prev = B.prev
This is essentially your x.next.prev. When later changing the forward pointer of A to point from B to C, you would write B.prev.next = B.next, which contains your second form x.prev.next.
For doubly linked lists, this is useful not only in removing, but also adding new nodes. Check out more pseudocode for that here.
Also note that in the situation in the diagram above the "invariant" B.next.prev == B does not hold. You can meet code that has to deal with this condition violation, for example, in lock-free algorithms. There, when reading x.next.prev you aren't sure that it is the same as x and have to deal with the situation when it's not.
"x.next" refers to the next node in the linked list. This node (like all of the nodes in the list) has a "next" and a "previous" pointer. The "previous" pointer of this node refers to the original node.
So x.next.previous refers to the current node, as long as "x.next" is non-null.
Likewise, x.prev.next refers to the current node, as long as "x.prev" is non-null.
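As a small C sketch of the removal described in the answers above: the type `struct dnode` and the function `dlist_unlink` are hypothetical names, and the list is assumed to be NULL-terminated at both ends, which is why each dereference is guarded.

```c
#include <stddef.h>

/* Hypothetical node type for a NULL-terminated doubly linked list. */
struct dnode {
    struct dnode *prev;
    struct dnode *next;
    int value;
};

/* Remove x from its list, given only x. */
void dlist_unlink(struct dnode *x)
{
    if (x->next)
        x->next->prev = x->prev;   /* x.next.prev = x.prev  (C.prev = A) */
    if (x->prev)
        x->prev->next = x->next;   /* x.prev.next = x.next  (A.next = C) */
    x->prev = NULL;
    x->next = NULL;
}
```

Before the call, `x->next->prev == x` and `x->prev->next == x` both hold for an interior node; the two assignments are what make the neighbours skip over x.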

Find number of leaves under each node of a tree

I have a tree which is represented in the following format:
nodes is a list of nodes in the tree in the order of their height from top. Node at height 0 is the first element of nodes. Nodes at height 1 (read from left to right) are the next elements of nodes and so on.
n_children is a list of integers such that n_children[i] = num children of nodes[i]
For example given a tree like {1: {2, 3:{4,5,2}}}, nodes=[1,2,3,4,5,2], n_children = [2,0,3,0,0,0].
Given a Tree, is it possible to generate nodes and n_children and the number of leaves corresponding to each node in nodes by traversing the tree only once?
Is such a representation unique? Or is it possible for two different trees to have the same representation?
For the first question - creating the representation given a tree:
I am assuming by "a given tree" we mean a tree that is given in the form of node-objects, each holding its value and a list of references to its children-node-objects.
I propose this algorithm:
1. Start at node=root.
2. if node.children is empty return {values_list:[[node.value]], children_list:[[0]]}
3. otherwise:
3.1. construct two lists. One will be called values_list and each element there shall be a list of values. The other will be called children_list and each element there shall be a list of integers. Each element in these two lists will represent a level in the sub-tree beginning with node, including node itself (will be added at step 3.3).
So values_list[1] will become the list of values of the children-nodes of node, and values_list[2] will become the list of values of the grandchildren-nodes of node. values_list[1][0] will be the value of the leftmost child-node of node. And values_list[0] will be a list with one element alone, values_list[0][0], which will be the value of node.
3.2. for each child-node of node (for which we have references through node.children):
3.2.1. start over at (2.) with the child-node set to node, and the returned results will be assigned back (when the function returns) to child_values_list and child_children_list accordingly.
3.2.2. for each index i in the lists (they are of same length) if there is a list already in values_list[i] - concatenate child_values_list[i] to values_list[i] and concatenate child_children_list[i] to children_list[i]. Otherwise assign values_list[i]=child_values_list[i] and children_list[i]=child_children_list[i] (that would be a push - adding to the end of the list).
3.3. Make node.value the sole element of a new list and add that list to the beginning of values_list. Make node.children.length the sole element of a new list and add that list to the beginning of children_list.
3.4. return values_list and children_list
when the above returns with values_list and children_list for node=root (from step (1)), all we need to do is concatenate the elements of the lists (because they are lists, each for one specific level of the tree). After concatenating the list-elements, the resulting values_list_concatenated and children_list_concatenated will be the wanted representation.
In the algorithm above we visit a node only by starting step (2) with it set as node and we do that only once for each child of a node we visit. We start at the root-node and each node has only one parent => every node is visited exactly once.
For the number of leaves associated with each node: (if I understand correctly - the number of leaves in the sub-tree a node is its root), we can add another list that will be generated and returned: leaves_list.
In the stop-case (no children to node - step (2)) we will return leaves_list:[[1]]. In step (3.2.2) we will concatenate the list-elements like the other two lists' list-elements. And in step (3.3) we will sum the first list-element leaves_list[0] and will make that sum the sole element in a new list that we will add to the beginning of leaves_list. (something like leaves_list.add_to_beginning([leaves_list[0].sum()]))
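For comparison, here is a C sketch that reaches the same three lists with a different technique: a breadth-first walk instead of the recursive per-level merge described above. The names `struct tnode`, `flatten` and `count_leaves` are hypothetical, and the caller is assumed to supply arrays large enough for every node. The tree is walked once by `flatten`; the leaf counts are then derived from n_children alone, relying on the fact that in level order each node's children form a consecutive block.

```c
#include <stddef.h>

/* Hypothetical tree node: children is an array of n_children pointers. */
struct tnode {
    int value;
    size_t n_children;
    struct tnode **children;
};

/* Breadth-first walk: each node is visited exactly once. Writes values
   and child counts in level order and returns the node count. queue[]
   is caller-provided scratch space with room for every node. */
size_t flatten(struct tnode *root, struct tnode **queue,
               int *nodes, int *n_children)
{
    size_t head = 0, tail = 0;
    if (root)
        queue[tail++] = root;
    while (head < tail) {
        struct tnode *t = queue[head];
        nodes[head] = t->value;
        n_children[head] = (int)t->n_children;
        for (size_t k = 0; k < t->n_children; k++)
            queue[tail++] = t->children[k];
        head++;
    }
    return tail;
}

/* leaves[i] = number of leaves in the subtree rooted at nodes[i].
   Assumes n_children[] describes a valid tree in level order. */
void count_leaves(const int *n_children, size_t n, int *leaves)
{
    /* Walking from the last node to the first, the block of children of
       the current node ends where the previously consumed block began. */
    size_t block_end = n;
    for (size_t i = n; i-- > 0; ) {
        if (n_children[i] == 0) {
            leaves[i] = 1;
            continue;
        }
        size_t block_start = block_end - (size_t)n_children[i];
        int sum = 0;
        for (size_t c = block_start; c < block_end; c++)
            sum += leaves[c];
        leaves[i] = sum;
        block_end = block_start;
    }
}
```

For the example tree from the question, n_children = [2,0,3,0,0,0] gives leaf counts [4,1,3,1,1,1].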
For the second question - is this representation unique:
To prove uniqueness we want to show that the function (let's call it rep for "representation") preserves distinctness over the space of trees, i.e. that it is an injection. As you can see in the wiki linked, for that it suffices to show that there exists a function (let's call it tre for "tree") that, given a representation, gives a tree back, and that for every tree t it holds that tre(rep(t))=t. In simple words: we can write a method that takes a representation and builds a tree out of it, and for every tree, if we make its representation and pass that representation through that method, we get the exact same tree back.
So let's get cracking!
Actually the first job - creating that method (the function tre) is already done by you - by the way you explained what the representation is. But let's make it explicit:
1. if the lists are empty return the empty tree. Otherwise continue
2. make the root node with values[0] as its value and n_children[0] as its number of children (without making the children nodes yet).
3. initiate a list-index i=1 and a level index li=1 and level-elements index lei=root.children.length and a next-level-elements accumulator nle_acc=0
4. while lei>0:
4.1. for lei times:
4.1.1. make a node with values[i] as value and n_children[i] as the number of children.
4.1.2. add the new node as the leftmost child in level li that has not been filled yet (traverse the tree to the li level from the leftmost in right direction and assign the new node to the first reference that is not assigned yet. We know the previous level is done, so each node in the li-1 level has a children.length property we can check and see if each has filled the number of children they should have)
4.1.3. add nle_acc+=n_children[i]
4.1.4. increment ++i
4.2. assign lei=nle_acc (level-elements can take what the accumulator gathered for it)
4.3. clear nle_acc=0 (next-level-elements accumulator needs to accumulate from the start for the next round)
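The decoding steps above can be sketched compactly in C. This is a simplified variant, not the exact level-by-level procedure: instead of building node objects it records, for each index, the index of its parent, which is enough to pin down the tree shape. The function name `rebuild_parents` and the `parent` array are hypothetical.

```c
#include <stddef.h>

/* Fill parent[i] with the index of node i's parent (-1 for the root),
   using only n_children[] in level order. Returns 1 if the counts
   describe a single tree of exactly n nodes, 0 otherwise. */
int rebuild_parents(const int *n_children, size_t n, long *parent)
{
    if (n == 0)
        return 1;                 /* the empty tree */
    parent[0] = -1;
    size_t next = 1;              /* next node still waiting for a parent */
    for (size_t p = 0; p < n; p++) {
        if (p >= next)
            return 0;             /* node p is not attached to anything */
        for (int k = 0; k < n_children[p]; k++) {
            if (next >= n)
                return 0;         /* counts claim more nodes than exist */
            parent[next++] = (long)p;
        }
    }
    return next == n;
}
```

Every step is forced by the input: children are handed out left to right to the earliest parent that still has room, so a given pair of lists can only decode to one tree. For n_children = [2,0,3,0,0,0] the parents come out as [-1,0,0,2,2,2].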
Now we need to prove that an arbitrary tree that is passed through the first algorithm and then through the second algorithm (this one here) will get out of all of that the same as it was originally.
As I'm not trying to prove the correctness of the algorithms (although I should), let's assume they do what I intended them to do, i.e. the first one writes the representation as you described it, and the second one makes a tree level-by-level, left-to-right, assigning a value and the number of children from the representation and filling the children references according to those numbers when it comes to the next level.
So each node has the right amount of children according to the representation (that's how the children were filled), and that number was written from the tree (when generating the representation). And the same is true for the values and thus it is the same tree as the original.
The proof actually should be much more elaborate and detailed - but I think I'll leave it at that now. If there will be a demand for elaboration maybe I'll make it an actual proof.

Circular Linked list and the iterator

I have a question in my algorithm class in data structures.
For which of the following representations can all basic queue operations be performed in constant worst-case time?
To achieve constant worst-case time with a circular linked list, where should I keep the iterator?
They have given two choices:
Maintain an iterator that corresponds to the first item in the list
Maintain an iterator that corresponds to the last item in the list.
My answer is that to get constant worst-case time we should maintain the iterator that corresponds to the last item in the list, but I don't know how to justify and explain it. What are the important points needed to justify this answer?
For which of the following representations can all basic queue operations be performed in constant worst-case time?
My answer is that to get the worst case time we should maintain the iterator that correspond to the last item
Assuming that your circular list is singly-linked, and that "the last item" in the circular list is the one that has been inserted the latest, your answer is correct *. In order to prove that you are right, you need to demonstrate how to perform these four operations in constant time:
1. Get the front element - Since the queue is circular and you have an iterator pointing to the latest inserted element, the next element from the latest inserted is the front element (i.e. the earliest inserted).
2. Get the back element - Since you maintain an iterator pointing to the latest inserted element, getting the back of the queue is a matter of dereferencing the iterator.
3. Enqueue - This is a matter of inserting after the iterator that you hold, and moving the iterator to the newly inserted item.
4. Dequeue - Copy the content of the front element (described in #1) into a temporary variable, re-point the next link of the latest inserted element to that of the front element, and delete the front element.
Since none of these operations require iterating the list, all of them can be performed in constant time.
* With doubly-linked circular lists both answers would be correct.
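A minimal C sketch of the four operations, assuming a singly-linked circular list and using hypothetical names (`struct qnode`, `struct cqueue`, `cq_enqueue`, `cq_dequeue`). The only state kept is a pointer to the most recently inserted node; the empty and single-node cases are handled explicitly, which the prose above leaves implicit.

```c
#include <stddef.h>

struct qnode {
    struct qnode *next;
    int value;
};

/* tail points at the most recently inserted node; tail->next is the
   front of the queue. tail == NULL means the queue is empty. */
struct cqueue {
    struct qnode *tail;
};

/* Insert n after the tail and make it the new tail: O(1). */
void cq_enqueue(struct cqueue *q, struct qnode *n)
{
    if (q->tail == NULL) {
        n->next = n;               /* a single node points at itself */
    } else {
        n->next = q->tail->next;   /* new node points at the front */
        q->tail->next = n;
    }
    q->tail = n;
}

/* Detach and return the front node, or NULL if empty: O(1). */
struct qnode *cq_dequeue(struct cqueue *q)
{
    if (q->tail == NULL)
        return NULL;
    struct qnode *front = q->tail->next;
    if (front == q->tail)
        q->tail = NULL;            /* removed the only node */
    else
        q->tail->next = front->next;
    front->next = NULL;
    return front;
}
```

The front is `q->tail->next` and the back is `q->tail`, so both lookups are a single dereference. If only the first item were tracked instead, enqueue would have to walk the whole ring to find the node whose next link must change.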

Break the linked list into smaller linked lists

I need to break a singly linked list into smaller linked lists after every 2 nodes. The approach I thought of was:
create an array to hold n/2 head pointers
walk the linked list and store a node's address in the array after every 2 nodes are encountered.
Can there be a better approach for this?
Thanks.
That seems like a good approach.
You also need to remember to set the next member of the 2nd, 4th, etc... elements to null to break the long list into smaller pieces. Remember to store the old value before you overwrite it as you will need to use it while you iterate.
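A short C sketch of that approach, with hypothetical names (`struct snode`, `split_pairs`). It assumes the caller provides a `heads` array with room for (n + 1) / 2 entries, so that a trailing single node still gets a slot when n is odd:

```c
#include <stddef.h>

struct snode {
    struct snode *next;
    int value;
};

/* Cut the list starting at head into pieces of at most two nodes.
   Stores the head of each piece in heads[] and returns the number
   of pieces. */
size_t split_pairs(struct snode *head, struct snode **heads)
{
    size_t count = 0;
    while (head) {
        heads[count++] = head;
        /* Last node of this piece: the second node if there is one. */
        struct snode *last = head->next ? head->next : head;
        struct snode *rest = last->next;   /* save before overwriting */
        last->next = NULL;                 /* break the long list here */
        head = rest;
    }
    return count;
}
```

The whole split is a single pass, and the only extra storage is the array of head pointers.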

Hashtable with doubly linked lists?

Introduction to Algorithms (CLRS) states that a hash table using doubly linked lists is able to delete items more quickly than one with singly linked lists. Can anybody tell me what is the advantage of using doubly linked lists instead of single linked list for deletion in Hashtable implementation?
The confusion here is due to the notation in CLRS. To be consistent with the true question, I use the CLRS notation in this answer.
We use the hash table to store key-value pairs. The value portion is not mentioned in the CLRS pseudocode, while the key portion is defined as k.
In my copy of CLR (I am working off of the first edition here), the routines listed for hashes with chaining are insert, search, and delete (with more verbose names in the book). The insert and delete routines take argument x, which is the linked list element associated with key key[x]. The search routine takes argument k, which is the key portion of a key-value pair. I believe the confusion is that you have interpreted the delete routine as taking a key, rather than a linked list element.
Since x is a linked list element, having it alone is sufficient to do an O(1) deletion from the linked list in the h(key[x]) slot of the hash table, if it is a doubly-linked list. If, however, it is a singly-linked list, having x is not sufficient. In that case, you need to start at the head of the linked list in slot h(key[x]) of the table and traverse the list until you finally hit x to get its predecessor. Only when you have the predecessor of x can the deletion be done, which is why the book states the singly-linked case leads to the same running times for search and delete.
Additional Discussion
Although CLRS says that you can do the deletion in O(1) time, assuming a doubly-linked list, it also requires you have x when calling delete. The point is this: they defined the search routine to return an element x. That search is not constant time for an arbitrary key k. Once you get x from the search routine, you avoid incurring the cost of another search in the call to delete when using doubly-linked lists.
The pseudocode routines are lower level than you would use if presenting a hash table interface to a user. For instance, a delete routine that takes a key k as an argument is missing. If that delete is exposed to the user, you would probably just stick to singly-linked lists and have a special version of search to find the x associated with k and its predecessor element all at once.
Unfortunately my copy of CLRS is in another country right now, so I can't use it as a reference. However, here's what I think it is saying:
Basically, a doubly linked list supports O(1) deletions because if you know the address of the item, you can just do something like:
x.left.right = x.right;
x.right.left = x.left;
to delete the object from the linked list, whereas in a singly linked list, even if you have the address, you need to search through the list to find its predecessor in order to do:
pred.next = x.next
So, when you delete an item from the hash table, you look it up, which is expected O(1) due to the properties of hash tables, then delete it in O(1), since you now have the address.
If this was a singly linked list, you would need to find the predecessor of the object you wish to delete, which would take O(n).
However:
I am also slightly confused about this assertion in the case of chained hash tables, because of how lookup works. In a chained hash table, if there is a collision, you already need to walk through the linked list of values in order to find the item you want, and thus would need to also find its predecessor.
But, the way the statement is phrased gives clarification: "If the hash table supports deletion, then its linked lists should be doubly linked so that we can delete an item quickly. If the lists were only singly linked, then to delete element x, we would first have to find x in the list T[h(x.key)] so that we could update the next attribute of x’s predecessor."
This is saying that you already have element x, which means you can delete it in the above manner. If you were using a singly linked list, even if you had element x already, you would still have to find its predecessor in order to delete it.
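To make the singly-linked case concrete, here is a C sketch with hypothetical names (`struct chain`, `chain_remove`); it is not CLRS's pseudocode. Even though the caller already holds x, the function still has to walk the bucket to find the link that points at x:

```c
#include <stddef.h>

struct chain {
    struct chain *next;
    int key;
};

/* Remove x from the singly linked chain at *bucket. Returns 1 if x
   was found and unlinked, 0 if x is not in this chain. */
int chain_remove(struct chain **bucket, struct chain *x)
{
    struct chain **link = bucket;
    while (*link && *link != x)
        link = &(*link)->next;     /* O(chain length) predecessor search */
    if (*link == NULL)
        return 0;
    *link = x->next;               /* pred.next = x.next */
    x->next = NULL;
    return 1;
}
```

With a doubly linked chain the `while` loop disappears, which is the whole of the saving the book is describing.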
I can think of one reason, but this isn't a very good one. Suppose we have a hash table of size 100. Now suppose values A and G are each added to the table. Maybe A hashes to slot 75. Now suppose G also hashes to 75, and our collision resolution policy is to jump forward by a constant step size of 80. So we try to jump to (75 + 80) % 100 = 55. Now, instead of starting at the front of the list and traversing forward 55, we could start at the current node and traverse backwards 20, which is faster. When we get to the node that G is at, we can mark it as a tombstone to delete it.
Still, I recommend using arrays when implementing hash tables.
A hash table is often implemented as a vector of lists, where the index into the vector is the hash of the key.
If you don't have more than one value per key and you are not interested in any logic regarding those values, a singly linked list is enough. A more complex or specific design for selecting one of the values may require a doubly linked list.
Let's design the data structures for a caching proxy. We need a map from URLs to content; let's use a hash table. We also need a way to find pages to evict; let's use a FIFO queue to track the order in which URLs were last accessed, so that we can implement LRU eviction. In C, the data structure could look something like
struct node {
    struct node *queueprev, *queuenext;
    struct node **hashbucketprev, *hashbucketnext;
    const char *url;
    const void *content;
    size_t contentlength;
};
struct node *queuehead; /* circular doubly-linked list */
struct node **hashbucket;
One subtlety: to avoid a special case and wasting space in the hash buckets, x->hashbucketprev points to the pointer that points to x. If x is first in the bucket, it points into hashbucket; otherwise, it points into another node. We can remove x from its bucket with
if (x->hashbucketnext) /* x may be the last node in its bucket */
    x->hashbucketnext->hashbucketprev = x->hashbucketprev;
*(x->hashbucketprev) = x->hashbucketnext;
When evicting, we iterate over the least recently accessed nodes via the queuehead pointer. Without hashbucketprev, we would need to hash each node and find its predecessor with a linear search, since we did not reach it via hashbucketnext. (Whether that's really bad is debatable, given that the hash should be cheap and the chain should be short. I suspect that the comment you're asking about was basically a throwaway.)
If the items in your hashtable are stored in "intrusive" lists, they can be aware of the linked list they are a member of. Thus, if the intrusive list is also doubly-linked, items can be quickly removed from the table.
(Note, though, that the "intrusiveness" can be seen as a violation of abstraction principles...)
An example: in an object-oriented context, an intrusive list might require all items to be derived from a base class.
class BaseListItem {
    BaseListItem *prev, *next;
    ...
public: // list operations
    void insertAfter(BaseListItem*);
    void insertBefore(BaseListItem*);
    void removeFromList();
};
The performance advantage is that any item can be quickly removed from its doubly-linked list without locating or traversing the rest of the list.
