Break a linked list into smaller linked lists

I need to break a singly linked list into smaller linked lists after every 2 nodes. The approach I thought of was:
create an array containing the head pointers of the n/2 resulting lists,
hop along the linked list and store the current address in the array after
every 2 nodes are encountered.
Can there be a better approach for this?
Thanks.

That seems like a good approach.
You also need to remember to set the next member of the 2nd, 4th, etc. elements to null to break the long list into smaller pieces, and to store the old value of next before you overwrite it, as you will need it to keep iterating.
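To make the bookkeeping concrete, here is a minimal C sketch of that approach (the node type, the function name, and the use of malloc for the head array are assumptions of this sketch, not something from the question):

#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Split the list at 'head' into pieces of at most 2 nodes.
   Returns a malloc'd array of sublist heads; *count receives its size. */
struct node **split_in_pairs(struct node *head, size_t *count)
{
    size_t n = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        n++;
    size_t pieces = (n + 1) / 2;                   /* ceil(n / 2) */
    struct node **heads = malloc(pieces * sizeof *heads);
    *count = pieces;

    for (size_t i = 0; head != NULL; i++) {
        heads[i] = head;                           /* this piece's head */
        if (head->next != NULL) {
            struct node *rest = head->next->next;  /* save before overwriting */
            head->next->next = NULL;               /* terminate the 2-node piece */
            head = rest;
        } else {
            head = NULL;                           /* odd tail: a 1-node piece */
        }
    }
    return heads;
}

The two things the answer warns about are both visible here: saving head->next->next before it is overwritten, and the explicit NULL that terminates each piece.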

Data structures: linked list

I was asked to solve this:
There are two singly linked lists.
I need to write a method that takes those two linked lists and returns a pointer to the starting point of the suffix that is the same in both lists.
Example:
given:
1->2->4->6->8->10->15
2->4->8->10->15
the returned value would be a pointer to the member 8.
But,
I need to do it without changing the lists or using more memory,
and we need to scan the lists only once, meaning T(n) = O(n).
1. Measure the lengths of both lists.
2. Skip forward in the longer list until both lists have the same remaining length.
3. Walk forward through both lists in step, and remember the nodes AFTER the last point where the lists have different elements. These are the pointers you can return.
Now... I know you said that you only want to scan the lists once, and I have done it twice, but you also said that this means T(n) = O(n), and that is not quite right: scanning the lists twice is also O(n), and it is required to solve the problem without using unbounded extra memory.
This is pseudocode, not Python code, for that matter.
Take the two lists; let p1 point into the longer list, skipped forward by the length difference, and let p2 point to the head of the shorter list, so both pointers are the same distance from their ends:

returnpointer = p1
while p1 != NULL:
    if p1->value != p2->value:
        returnpointer = p1->next  // the suffix can begin, at the earliest, right after the last mismatch; if even the last elements differ, this ends up NULL anyway
    p1 = p1->next
    p2 = p2->next
return returnpointer
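For reference, a C version of the same two-pass idea (the node type and function names are assumptions of this sketch):

#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

static size_t list_length(const struct node *p)
{
    size_t n = 0;
    for (; p != NULL; p = p->next)
        n++;
    return n;
}

/* Returns the first node (in the first list) of the longest common
   suffix, or NULL if the lists do not even end in the same element. */
const struct node *common_suffix(const struct node *a, const struct node *b)
{
    size_t la = list_length(a), lb = list_length(b);
    while (la > lb) { a = a->next; la--; }    /* align the longer list */
    while (lb > la) { b = b->next; lb--; }
    const struct node *start = a;             /* candidate suffix start */
    for (; a != NULL; a = a->next, b = b->next)
        if (a->value != b->value)
            start = a->next;                  /* suffix must begin after a mismatch */
    return start;
}

On the example above, the alignment step skips 1 and 2 in the longer list, the mismatches at 4/2 and 6/4 push the candidate forward, and the function returns the node holding 8.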

Is there such a data structure - "linked list with samples"

Is there such a data structure:
There is a slow list data structure, such as a linked list or data saved on disk.
There is a relatively small array of pointers to some of the elements in the "slow list", hopefully evenly distributed.
Then when you search, you first check the array and then perform the normal search (a linked-list search, or a binary search in the case of disk data) from the nearest sampled element.
This looks very similar to jump search, sample search and to skip lists, but I think it is a different algorithm.
Please note I am giving examples with a linked list or a file on disk because they are slow structures.
I don't know if there's a name for this algorithm (I don't think it deserves one, though if there isn't, it could bear mine:), but I did implement something like that 10 years ago for an interview.
You can have an array of pointers to the elements of a list. An array of fixed size, say, of 256 pointers. When you construct the list or traverse it for the first time, you store pointers to its elements in the array. So, for a list of 256 or fewer elements you'd have a pointer to each element.
As the list grows beyond 256 elements, you drop every odd-numbered pointer by moving the 128 even-numbered pointers to the beginning of the array. When the array of pointers fills up again, you repeat the procedure. At every such point you double the step between the list elements whose addresses end up in the array of pointers. Initially you'd place every element's address there, then every other's, then of one out of four and so on.
You end up with an array of pointers to the list elements spaced apart by the list length / 256.
If the list is singly-linked, locating the i-th element from the beginning or the end of it is reduced to searching within 1/256th of the list.
If the list is sorted, you can perform binary search on the array to locate the bin (the 1/256th portion of the list) where to look further.
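A sketch of that compaction scheme in C (the array size, the names, and the append-driven interface are assumptions of this sketch):

#include <stddef.h>

#define SAMPLES 256

struct node { int value; struct node *next; };

struct sampled_list {
    struct node *samples[SAMPLES]; /* every step-th node, in list order */
    size_t nsamples;               /* slots currently in use */
    size_t step;                   /* list distance between samples, starts at 1 */
    size_t pending;                /* appends since the last stored sample */
};

/* Call once for every node appended to the underlying list. */
void on_append(struct sampled_list *s, struct node *n)
{
    if (++s->pending < s->step)
        return;                                 /* not at a sample point yet */
    s->pending = 0;
    if (s->nsamples == SAMPLES) {               /* array full: halve the density */
        for (size_t i = 0; i < SAMPLES / 2; i++)
            s->samples[i] = s->samples[2 * i];  /* keep even-numbered pointers */
        s->nsamples = SAMPLES / 2;
        s->step *= 2;                           /* double the spacing */
    }
    s->samples[s->nsamples++] = n;
}

To search, find the nearest preceding sample (a linear scan, or a binary search over the array if the list is sorted) and walk the list from there; at most step - 1 nodes need to be traversed.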

What is the need for an asymmetric linked list

I am studying data structures. I have come across the asymmetric linked list, which is described as a special type of doubly linked list in which
1. the next link points to the next node's address
2. the prev link points to the current node's address itself
But I wonder,
1. what advantages do we get by designing such a linked list?
2. what kind of applications would it be suitable for?
Could anyone kindly explain more about the asymmetric linked list? I googled but could not find relevant answers. Thank you.
Source: http://en.wikipedia.org/wiki/Doubly_linked_list#Asymmetric_doubly-linked_list
I agree the Wiki page is misleading. Here is the difference between LL and ALL:
Open Linked List:
node.next = nextNode
node.prev = prevNode
Asymmetric Linked List:
node.next = nextNode
node.prev = prevNode.next
Note the difference prevNode vs prevNode.next.
While pointing to a pointer within the previous node still preserves the ability to traverse the list backwards (you can recover prevNode's address by subtracting the offset of the next field from the address stored in prev), it may simplify insertion and deletion operations on the list, especially on the start element.
Given a node pointer from a doubly linked list, we can traverse all the nodes via 'prev' and 'next', while a singly linked list cannot do that if the pointer provided didn't point to the first node.
E.g., deleting a node from a linked list: with a singly linked list, you have to traverse the list from the head to find the specific node, and also record the node just before it, which makes the time complexity O(n). With a doubly linked list, you can delete the given node in constant time.
In short, given a specific node, a singly linked list needs an O(n) traversal from the head whenever the previous node's information is required, while a doubly linked list doesn't.
By the way, list in the C++ STL and LinkedList in Java are implemented with doubly linked lists.
Because a picture is worth a thousand words (the diagram from the original answer is not reproduced here):
As the diagram showed, the "previous" field references the "next" field of the previous element, rather than the previous element itself. This makes little difference for interior nodes, but for the first element the previous field can point at the list head itself, rather than pointing to the last element (as in a circular list) or being null.
The main advantage is for insertion and deletion: you don't need to take care of the head or check whether the element is the first one. Just having a single pointer to an element is enough to perform an insert or a delete on the list.
One disadvantage vs. a circular list: the only way to get the last element (e.g. to implement some "add last" operation) is to loop through the whole list.
You also lose the ability to traverse the list in reverse (because there is no true previous pointer), unless all elements have the same size and you are allowed to do pointer arithmetic (as in C/C++).
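In C, the "previous points at a next field" trick is naturally expressed as a pointer-to-pointer. A minimal sketch (the names and the int payload are assumptions):

#include <stddef.h>

struct anode {
    int value;
    struct anode *next;
    struct anode **prev;   /* points at the pointer that points at us */
};

/* Insert n at the front. The head pointer behaves like any other
   next field, so an empty list is not a special case. */
void push_front(struct anode **head, struct anode *n)
{
    n->next = *head;
    n->prev = head;
    if (n->next != NULL)
        n->next->prev = &n->next;
    *head = n;
}

/* Unlink n given only a pointer to it: O(1), no head, no special cases. */
void unlink_node(struct anode *n)
{
    *n->prev = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
}

The same pattern shows up again in the caching-proxy answer further down, where hashbucketprev plays exactly this role.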

Linked Lists and Sentinel Nodes

So I have been asked in my homework to merge-sort two sorted circular linked lists without using a sentinel node; also, the lists can be empty. My question is: what is a sentinel node in the first place?
A sentinel node is a node that contains no real data - it's just there for the convenience of the implementation.
Thus a list with 4 real elements might have one or more extra nodes, making a total of 5 or 6 nodes.
Those extra nodes might be place holders (e.g. marking where you started the merge), pseudo-nodes indicating the head of the list, or anything else the algorithm designer can think up.
A sentinel node is a node that you add to your code to avoid handling degeneracies with special code. For merge sort for example, you can add a node with value = INFINITY to the end of both lists that you want to merge, this guarantees that once you hit the end of a list you can't go beyond that because the value is always greater (or equal) to the values in the other list.
So if you are not using a sentinel, you have to write code to handle this: in your merge routine, you should check explicitly whether you've reached the end of each list.
A sentinel node is a traversal-path terminator in linked lists and trees. It doesn't hold or reference any data managed by the data structure. One of its benefits is to reduce algorithmic complexity and code size. In your case, since you cannot use one, the complexity and code size will increase and the speed of the operation will decrease.
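Your assignment forbids sentinels, but to make the concept concrete, here is a C sketch of how a temporary dummy head acts as a sentinel when merging two sorted singly linked lists (the node type and names are assumptions; the circular, no-sentinel version is left as your homework):

#include <stddef.h>

struct node { int value; struct node *next; };

/* Merge two sorted lists into one. The stack-allocated dummy node is a
   sentinel head: its value is never read, it only guarantees that tail
   always has a node to append to. */
struct node *merge(struct node *a, struct node *b)
{
    struct node dummy;
    struct node *tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->value <= b->value) { tail->next = a; a = a->next; }
        else                      { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;   /* append whichever list remains */
    return dummy.next;
}

Without the dummy, the first node of the result would need its own if/else before the loop - exactly the kind of special case a sentinel exists to remove.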

Hashtable with doubly linked lists?

Introduction to Algorithms (CLRS) states that a hash table using doubly linked lists is able to delete items more quickly than one using singly linked lists. Can anybody tell me what the advantage of using doubly linked lists instead of singly linked lists for deletion in a hash table implementation is?
The confusion here is due to the notation in CLRS. To be consistent with the book, I use the CLRS notation in this answer.
We use the hash table to store key-value pairs. The value portion is not mentioned in the CLRS pseudocode, while the key portion is defined as k.
In my copy of CLR (I am working off of the first edition here), the routines listed for hashes with chaining are insert, search, and delete (with more verbose names in the book). The insert and delete routines take argument x, which is the linked list element associated with key key[x]. The search routine takes argument k, which is the key portion of a key-value pair. I believe the confusion is that you have interpreted the delete routine as taking a key, rather than a linked list element.
Since x is a linked list element, having it alone is sufficient to do an O(1) deletion from the linked list in the h(key[x]) slot of the hash table, if it is a doubly-linked list. If, however, it is a singly-linked list, having x is not sufficient. In that case, you need to start at the head of the linked list in slot h(key[x]) of the table and traverse the list until you finally hit x to get its predecessor. Only when you have the predecessor of x can the deletion be done, which is why the book states the singly-linked case leads to the same running times for search and delete.
Additional Discussion
Although CLRS says that you can do the deletion in O(1) time, assuming a doubly-linked list, it also requires you have x when calling delete. The point is this: they defined the search routine to return an element x. That search is not constant time for an arbitrary key k. Once you get x from the search routine, you avoid incurring the cost of another search in the call to delete when using doubly-linked lists.
The pseudocode routines are lower level than you would use if presenting a hash table interface to a user. For instance, a delete routine that takes a key k as an argument is missing. If that delete is exposed to the user, you would probably just stick to singly-linked lists and have a special version of search to find the x associated with k and its predecessor element all at once.
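A C sketch of that chained delete, given the element x itself (the names and types here are mine, not the book's):

#include <stddef.h>

struct elem {
    int key;
    struct elem *prev, *next;     /* doubly linked chain within a bucket */
};

struct table {
    struct elem **bucket;         /* array of chain heads */
    size_t nbuckets;
};

static size_t slot(const struct table *t, int key)
{
    return (size_t)key % t->nbuckets;   /* stand-in for h(k) */
}

/* O(1): x carries enough links to unsplice itself; the hash is only
   needed when x happens to be the head of its chain. */
void chained_hash_delete(struct table *t, struct elem *x)
{
    if (x->prev != NULL)
        x->prev->next = x->next;
    else
        t->bucket[slot(t, x->key)] = x->next;
    if (x->next != NULL)
        x->next->prev = x->prev;
}

With singly linked chains, the same routine would have to walk the bucket's list to find x's predecessor, which is exactly the extra cost the book is pointing at.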
Unfortunately my copy of CLRS is in another country right now, so I can't use it as a reference. However, here's what I think it is saying:
Basically, a doubly linked list supports O(1) deletions because if you know the address of the item, you can just do something like:
x.left.right = x.right;
x.right.left = x.left;
to delete the object from the linked list, whereas in a singly linked list, even if you have the address, you need to search through the list to find its predecessor in order to do:
pred.next = x.next
So, when you delete an item from the hash table, you look it up, which is O(1) due to the properties of hash tables, then delete it in O(1), since you now have the address.
If this was a singly linked list, you would need to find the predecessor of the object you wish to delete, which would take O(n).
However:
I am also slightly confused about this assertion in the case of chained hash tables, because of how lookup works. In a chained hash table, if there is a collision, you already need to walk through the linked list of values in order to find the item you want, and thus would need to also find its predecessor.
But, the way the statement is phrased gives clarification: "If the hash table supports deletion, then its linked lists should be doubly linked so that we can delete an item quickly. If the lists were only singly linked, then to delete element x, we would first have to find x in the list T[h(x.key)] so that we could update the next attribute of x’s predecessor."
This is saying that you already have element x, which means you can delete it in the above manner. If you were using a singly linked list, even if you had element x already, you would still have to find its predecessor in order to delete it.
I can think of one reason, but this isn't a very good one. Suppose we have a hash table of size 100. Now suppose values A and G are each added to the table. Maybe A hashes to slot 75. Now suppose G also hashes to 75, and our collision resolution policy is to jump forward by a constant step size of 80. So we try to jump to (75 + 80) % 100 = 55. Now, instead of starting at the front of the list and traversing forward 85, we could start at the current node and traverse backwards 20, which is faster. When we get to the node that G is at, we can mark it as a tombstone to delete it.
Still, I recommend using arrays when implementing hash tables.
A hashtable is often implemented as a vector of lists, where the index in the vector is the key (hash).
If you don't have more than one value per key and you are not interested in any logic regarding those values, a singly linked list is enough. A more complex/specific design for selecting one of the values may require a doubly linked list.
Let's design the data structures for a caching proxy. We need a map from URLs to content; let's use a hash table. We also need a way to find pages to evict; let's use a FIFO queue to track the order in which URLs were last accessed, so that we can implement LRU eviction. In C, the data structure could look something like
struct node {
    struct node *queueprev, *queuenext;
    struct node **hashbucketprev, *hashbucketnext;
    const char *url;
    const void *content;
    size_t contentlength;
};
struct node *queuehead;    /* circular doubly-linked list */
struct node **hashbucket;
One subtlety: to avoid a special case and wasted space in the hash buckets, x->hashbucketprev points to the pointer that points to x. If x is first in its bucket, it points into hashbucket; otherwise, it points into another node. We can remove x from its bucket with
if (x->hashbucketnext != NULL)
    x->hashbucketnext->hashbucketprev = x->hashbucketprev;
*(x->hashbucketprev) = x->hashbucketnext;
(The NULL check covers the case where x is last in its bucket.)
When evicting, we iterate over the least recently accessed nodes via the queuehead pointer. Without hashbucketprev, we would need to hash each node and find its predecessor with a linear search, since we did not reach it via hashbucketnext. (Whether that's really bad is debatable, given that the hash should be cheap and the chain should be short. I suspect that the comment you're asking about was basically a throwaway.)
If the items in your hashtable are stored in "intrusive" lists, they can be aware of the linked list they are a member of. Thus, if the intrusive list is also doubly-linked, items can be quickly removed from the table.
(Note, though, that the "intrusiveness" can be seen as a violation of abstraction principles...)
An example: in an object-oriented context, an intrusive list might require all items to be derived from a base class.
class BaseListItem {
    BaseListItem *prev, *next;
    ...
public: // list operations
    void insertAfter(BaseListItem *item);
    void insertBefore(BaseListItem *item);
    void removeFromList();
};
The performance advantage is that any item can be quickly removed from its doubly-linked list without locating or traversing the rest of the list.
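The same idea in C, for comparison (the names and the circular-with-sentinel convention are assumptions of this sketch):

/* The links live inside the item itself ("intrusive"). */
struct list_link {
    struct list_link *prev, *next;
};

struct item {
    int payload;
    struct list_link link;   /* membership in one list */
};

/* O(1) removal given only the item's link: no traversal, no head pointer.
   Assumes a circular list with a sentinel head, so prev/next are never NULL. */
void list_remove(struct list_link *l)
{
    l->prev->next = l->next;
    l->next->prev = l->prev;
    l->next = l->prev = l;   /* leave the link self-referencing: "detached" */
}

The circular-with-sentinel convention is what removes the NULL checks here; compare the sentinel discussion above.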
