I am studying from my course book on Data Structures by Seymour Lipschutz and I have come across a point I don't fully understand.
The Binary Search Algorithm assumes that one has direct access to the middle element in the list. This means that the list must be stored in some type of linear array.
I read this and also recognised that in Python you can have access to the middle element at all times. Then the book goes on to say:
Unfortunately, inserting an element in an array requires elements to be moved down the list, and deleting an element from an array requires elements to be moved up the list.
How is this a drawback?
Won't we still be able to access the middle element by dividing the length of the array by 2?
In the case where the array will not be modified, the costs of insertion and deletion are not relevant.
However, if an array is to be used to maintain a sorted set of non-fixed items, then insertion and deletion costs are relevant. In this case, binary search can be used to find items (possibly for deletion) and/or find where new items should be inserted. The drawback is that insertion and deletion require movement of other elements.
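As a hedged Java sketch of that drawback (my own example, not from the answer above): binary search locates the insertion point in O(log n), but keeping the array sorted still moves O(n) elements.

import java.util.Arrays;

public class SortedInsert {
    // Insert key into the sorted prefix a[0..size-1] and return the new size.
    // Assumes the backing array has spare capacity (a.length > size).
    static int insertSorted(int[] a, int size, int key) {
        int i = Arrays.binarySearch(a, 0, size, key); // O(log n) to find the spot
        if (i < 0) i = -i - 1; // a miss is encoded as -(insertion point) - 1
        System.arraycopy(a, i, a, i + 1, size - i);   // O(n) shift to make room: the drawback
        a[i] = key;
        return size + 1;
    }
}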
Python's bisect module provides binary search functions that can locate insertion points for maintaining sorted order. The drawback mentioned above still applies: the underlying list must shift elements on every insertion and deletion.
In some cases, a binary search tree may be a preferable alternative to a sorted array for maintaining a sorted set of non-fixed items.
It seems that the author is comparing array-like structures and linked lists.
The first kind (array, Python and Java list, C++ vector) allows fast and simple access to any element by index, but appending, inserting or deleting may require reallocating memory and shifting elements.
For the second kind, we cannot address the i-th element directly; we need to traverse the list from the beginning. But once we have the element, we can insert or delete quickly.
Where can one use a (doubly-linked list) Positional List ADT? When the developer wants O(n) memory and O(1) (non-amortized) operations at an arbitrary position in the list? I would like to see an example of using a positional list. What would be the advantage of using a positional list over using an array-based sequence?
If your program often needs to add new elements to or delete elements from your data collection, a list is likely to be a better choice than an array.
Deleting the element at position N of an array requires a copy operation on all elements after element N. In principle:
for (int i = N; i < length - 1; i++)
    Arr[i] = Arr[i + 1]; // every element after N moves one slot to the left
A similar copy is needed when inserting a new element, i.e. to make room for it.
If your program frequently adds/deletes elements, the many copy operations may hurt performance.
As part of these operations, the position of existing elements changes: an element at position 1000 will be at either position 999 or 1001 after an element is deleted or added at position 50.
This can be a problem if some part of your program has searched for a specific element and saved its position (e.g. position 1000). After an element delete/add operation, the saved position is no longer valid.
A (doubly-linked) list "solves" the three problems described. With a list you can add/delete elements without copying existing elements to new positions. Consequently, the position of a specific element (e.g. a pointer to an element) will still be valid after an add/delete operation.
To summarize: if your program (frequently) adds or deletes randomly located elements, and if it requires that position information isn't affected by add/delete operations, a list may be a better choice than an array.
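Here is a hedged Java illustration (my own example) of the stale-position problem: an index saved into an array-backed list silently refers to a different element after an earlier deletion.

import java.util.ArrayList;
import java.util.List;

public class SavedPosition {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(List.of("a", "b", "c", "d"));
        int saved = list.indexOf("c");       // saved position: 2
        list.remove(0);                      // delete an earlier element: everything shifts left
        System.out.println(list.get(saved)); // prints "d", not "c": the saved index went stale
    }
}

A node reference into a (doubly) linked list has no such problem, since unlinking one node does not move the others.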
Positional Lists
So when working with arrays, indexes are great for locating positions for insertion and deletion. However, indexes are not great for linked structures (like a linked list), mainly because even if we have the index, we still have to traverse all the previous nodes in the linked structure. That means an index-based deletion or insertion in a linked list runs in O(N) time (no bueno). It is a general rule of thumb that we want data structure operations to run in either O(1) or O(log n) time. Also, an index says nothing about a node's position relative to other nodes.
What do Positions allow us to do?
Positions allow us to achieve constant-time insertions and deletions at arbitrary locations within our linked structure (kinda cool, right?).
They also allow us to describe the element relative to other elements.
Essentially, a Position gives us a reference to a memory address which we can then use for constant-time insertions and deletions. Your textbook image doesn't show a validate method; however, I would assume that the implementation has one. So just be aware that you will need a utility method validate to verify that a position is actually in the linked structure.
What does a Position look like?
In reality, a Position is an ADT (abstract data type), and in Java we formalize ADTs with interfaces, like so:
public interface Position<E> {
    E getElement() throws IllegalStateException;
}
A Position is just an abstraction that gets implemented on a Node within a linked structure. Why do this? Well, it gives our code greater flexibility and better abstraction.
So to implement a Position for a linked list node, it would look something like this:
private static class Node<E> implements Position<E> {
    private E element;          // the element stored at this node
    private Node<E> prev, next; // links to the neighboring nodes
    public E getElement() throws IllegalStateException { return element; }
    // THE CONSTRUCTOR AND ALL THE OTHER METHODS WOULD BE HERE
}
The Node class is private and static because it is assumed to be nested inside the linked list class. Ultimately, we use this Position as a data type in all the other methods of the positional linked list.
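To make the idea concrete, here is a minimal, hedged sketch (field and method names are my own, not necessarily the textbook's exact API) of the splice that makes position-based insertion O(1):

public class PositionalSketch<E> {
    private static class Node<E> implements Position<E> {
        E element;
        Node<E> prev, next;
        Node(E e, Node<E> p, Node<E> n) { element = e; prev = p; next = n; }
        public E getElement() { return element; }
    }

    private final Node<E> header;  // sentinel before the first element
    private final Node<E> trailer; // sentinel after the last element

    public PositionalSketch() {
        header = new Node<>(null, null, null);
        trailer = new Node<>(null, header, null);
        header.next = trailer;
    }

    // Splice e between two known nodes: a constant number of link updates, O(1).
    private Position<E> addBetween(E e, Node<E> pred, Node<E> succ) {
        Node<E> newest = new Node<>(e, pred, succ);
        pred.next = newest;
        succ.prev = newest;
        return newest; // the node itself acts as the Position handle
    }

    public Position<E> addFirst(E e) { return addBetween(e, header, header.next); }

    public Position<E> addAfter(Position<E> p, E e) {
        Node<E> node = (Node<E>) p; // a real implementation would validate p here
        return addBetween(e, node, node.next);
    }
}

Because addBetween only rewires a fixed number of links, insertion at any Position costs O(1), with no traversal and no element shifting.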
Real-world example
Well, you could use this approach to create a parse tree for a parser. You could have a binary tree that implements and uses the Position interface for all of its methods, and then use the tree for parsing.
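For instance, a hedged sketch (illustrative names; the defunct-node convention shown is one common textbook approach, not necessarily this book's) of a tree node doubling as a Position:

class TreeNode<E> implements Position<E> {
    E element;                       // e.g. a token or grammar symbol in a parse tree
    TreeNode<E> parent, left, right; // tree links

    public E getElement() throws IllegalStateException {
        if (parent == this) // convention: a removed node points to itself
            throw new IllegalStateException("position is no longer valid");
        return element;
    }
}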
A linear data structure traverses its data elements sequentially, and only one data element can be reached directly. Examples: arrays, linked lists.
But in a doubly linked list we can reach two data elements, using the previous pointer and the next pointer.
So can we say that a doubly linked list is a non-linear data structure?
Correct me if I am wrong.
Thank you.
Non-linear data structures are those in which the elements appear in a non-linear fashion, requiring a two- (or more) dimensional representation. The elements may or (mostly) may not be stored in contiguous memory locations; rather, they can sit in any order, non-linearly, as if you had skipped the elements in between. Accessing the elements is also done in an out-of-order pattern.
Example: a tree. Here one may iterate from the root to its right child, then to that node's right child, ... and so on, thereby skipping all the left nodes.
But in a doubly linked list you can only move sequentially (linearly), either forward (using the next pointer) or backward (using the previous pointer).
You can't jump from any element in the list to any distant element without traversing the intermediate elements.
Hence, a doubly-linked list is a linear data structure. In a linear data structure, the elements are arranged in a linear fashion (that is, a one-dimensional representation).
You are wrong; here are two justifications:
While you can get to 2 elements from any node, one of them was the one you used to get to this node, so you can only get to one new node from each.
It is still linear in that it has to be traversed sequentially, or in a line.
It is still sequential: you need to go over some elements in the list to get to a particular element, compared to an array where you can randomly access each element.
However, you can go linearly forwards or backwards, which may optimize the search.
A linked list is basically a linear data structure because it stores data in a linear fashion. A linear data structure is one that stores data in a linear format and is traversed in a sequential manner, not in a zigzag way.
It depends on where you intend to apply linked lists. If you base it on storage, a linked list is considered non-linear. On the other hand, if you base it on access strategies, then a linked list is considered linear.
If we keep our elements in a sorted circular doubly linked list, the costs of the operations (insert, delete, max, min, successor, predecessor) are the same as or even better than those of a binary search tree. So why do we use binary search trees?
Is it because data structure authors want to familiarize the reader with the general concept of a tree as a data structure through some simple examples?
I have read some similar questions, but they were (inconsiderately!) asked about arrays instead of linked lists, and the answers were not useful for linked lists, since most of them addressed the problem of shifting elements in an array on insertion.
A linked list is not "addressable", in the sense that if you want to access the element in the middle of the list, for example to do binary search, you will have to walk the list. That is to say, the performance of list.get(index) is O(n). If you back it with any data structure that gives you O(1) indexing, it will be an array in the end, and then we are back to the problem of allocating extra space and shifting elements, which is not as efficient as a binary search tree.
Actually, binary search cannot be done (efficiently) in a circular doubly linked list, since binary search needs the middle element, and we cannot access the middle element of a linked list unless we pay Θ(n/2) to walk to it, and then pay again for each subsequent step (half of the first half (1/4), or half of the second half (3/4), and so on).
But the idea of the binary search tree stems from exactly this: we essentially keep the middle of each part of our data at hand, to use for searching and for the other operations.
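As a concrete illustration of the Θ(n/2) access cost mentioned above, here is a small hedged Java sketch (my own names):

class MiddleOfList {
    static class Node {
        int value;
        Node next;
        Node(int v, Node n) { value = v; next = n; }
    }

    // Reaching the middle of an n-node list costs about n/2 pointer hops,
    // so each "halving" step of a binary search degenerates into a walk.
    static Node middle(Node head, int n) {
        Node cur = head;
        for (int i = 0; i < n / 2; i++)
            cur = cur.next;
        return cur;
    }
}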
In an interview today I was asked about the benefits of a doubly linked list over a singly linked list.
Apart from answering "reversing the list" and "both forward and backward traversal", there was something "fundamental" that the interviewer kept stressing. I gave up, and of course after the interview I did a bit of research. It seems that insertion and deletion are more efficient in a doubly linked list than in a singly linked list. I am not quite sure how it can be more efficient for a doubly linked list, since it is obvious that more references need to change.
Can anybody explain the secret behind it? I honestly did quite a bit of research and failed to understand, my main trouble being the fact that O(n) searching is still needed for the doubly linked list.
Insertion is clearly less work in a singly-linked list, as long as you are content to always insert at the head or after some known element. (That is, you cannot insert before a known element, but see below.)
Deletion, on the other hand, is trickier because you need to know the element before the element to be deleted.
One way of doing this is to make the delete API work with the predecessor of the element to be deleted. This mirrors the insert API, which takes the element which will be the predecessor of the new element, but it's not very convenient and it's hard to document. It's usually possible, though. Generally speaking, you arrive at an element in a list by traversing the list.
Of course, you could just search the list from the beginning to find the element to be deleted, so that you know what its predecessor was. That assumes that the delete API includes the head of the list, which is also inconvenient. Also, the search is stupidly slow.
The way that hardly anyone uses, but which is actually pretty effective, is to define a singly-linked list iterator to be the pointer to the element preceding the current target of the iterator. This is simple, only one indirection slower than using a pointer directly to the element, and makes both insertion and deletion fast. The downside is that deleting an element may invalidate other iterators to list elements, which is annoying. (It doesn't invalidate the iterator to the element being deleted, which is nice for traversals which delete some elements, but that's not much compensation.)
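Here is a hedged sketch (my own naming) of that iterator-as-predecessor idea:

class SinglyList<E> {
    static class Node<E> {
        E elem;
        Node<E> next;
        Node(E e, Node<E> n) { elem = e; next = n; }
    }

    final Node<E> head = new Node<>(null, null); // sentinel: head.next is the first element

    class Iter {
        Node<E> pred = head; // points at the node *before* the current target

        boolean hasCurrent() { return pred.next != null; }
        E current()          { return pred.next.elem; } // one extra indirection vs. a direct pointer
        void advance()       { pred = pred.next; }
        void insertBefore(E e) { pred.next = new Node<>(e, pred.next); } // O(1)
        void deleteCurrent()   { pred.next = pred.next.next; }          // O(1), no search needed
    }
}

Note how deleteCurrent leaves this iterator usable (it now targets the element after the deleted one), while any other iterator whose pred was the deleted node is invalidated, exactly the annoyance described above.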
If deletion is not important, perhaps because the datastructures are immutable, singly-linked lists offer another really useful property: they allow structure-sharing. A singly-linked list can happily be the tail of multiple heads, something which is impossible for a doubly-linked list. For this reason, singly-linked lists have traditionally been the simple datastructure of choice for functional languages.
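And a tiny hedged sketch of the structure-sharing point (illustrative; Cons is my own immutable cell, written as a Java 16+ record for brevity):

public class Sharing {
    record Cons<E>(E head, Cons<E> tail) {}

    public static void main(String[] args) {
        Cons<Integer> shared = new Cons<>(3, null);
        Cons<Integer> a = new Cons<>(1, shared); // a = [1, 3]
        Cons<Integer> b = new Cons<>(2, shared); // b = [2, 3]; the tail [3] is shared with a
        // With a doubly-linked list this is impossible: the shared node
        // would need two different prev pointers at once.
    }
}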
Here is some code that made it clearer to me... Having:
class Node {
    Node next;
    Node prev;
}
DELETE a node in a SINGLY LINKED LIST -O(n)-
You don't know which node precedes the one to delete, so you have to traverse the list until you find it:
void deleteNode(Node head, Node node) {
    // walk from the head so we always know the predecessor
    Node prevNode = head;
    Node tmpNode = prevNode.next;
    while (tmpNode != null) {
        if (tmpNode == node) {
            prevNode.next = tmpNode.next; // unlink: the predecessor skips over node
            return;
        }
        prevNode = tmpNode;
        tmpNode = prevNode.next;
    }
}
DELETE a node in a DOUBLY LINKED LIST -O(1)-
You can simply update the links like this:
void deleteNode(Node node) {
    // assumes node is neither the first nor the last (prev and next are non-null)
    node.prev.next = node.next;
    node.next.prev = node.prev;
}
Here are my thoughts on the doubly-linked list:
You have ready access/insertion at both ends.
It can work as a queue and a stack at the same time.
Node deletion requires no additional pointers.
You can apply hill-climb traversal since you already have access at both ends.
If you are storing numerical values and your list is sorted, you can keep a pointer/variable to the median; the search operation can then be highly optimized using a statistical approach.
If you are going to delete an element in a linked list, you will need to link the previous element to the next element. With a doubly linked list you have ready access to both elements because you have links to both of them.
This assumes that you already have a pointer to the element you need to delete and there is no searching involved.
"Apart from answering reversing the list and both forward and backward traversal, there was something 'fundamental'."
Nobody seems to have mentioned it: in a doubly linked list, it is possible to reinsert a deleted element just by having a pointer to the deleted element. See Knuth's Dancing Links paper. I think that's pretty fundamental.
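A hedged sketch of that trick, using the Node class (with prev and next) shown earlier:

// Unlink x from the list; x itself keeps its prev and next pointers.
void delete(Node x) {
    x.prev.next = x.next;
    x.next.prev = x.prev;
}

// Reinsert x later, using only x: its stale pointers still name its old neighbors.
void restore(Node x) {
    x.prev.next = x;
    x.next.prev = x;
}

This is the core move of Knuth's dancing links technique: a backtracking search can undo deletions in O(1) per node, as long as it restores them in reverse order.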
Because doubly linked lists have immediate access to both the front and the end of the list, they can insert data on either side in O(1), as well as delete data on either side in O(1). Because doubly linked lists can insert data at the end in O(1) time and delete data from the front in O(1) time, they make the perfect underlying data structure for a queue. Queues are lists of items in which data can only be inserted at the end and removed from the beginning.
Queues are an example of an abstract data type, and we are able to use an array to implement them under the hood.
Now, since queues insert at the end and delete from the beginning, arrays are only so good as the underlying data structure. While arrays are O(1) for insertions at the end, they're O(N) for deleting from the beginning.
A doubly linked list, on the other hand, is O(1) both for inserting at the end and for deleting from the beginning. That's what makes it a perfect fit for serving as the queue's underlying data structure.
The doubly linked list is used in LRU cache design, since we frequently need to remove the least recently used items. The deletion operation is faster. To delete the least recently used item, we just delete it from the end; to add a new item to the cache, we just append a new node at the beginning of the list.
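Java's LinkedHashMap pairs a hash table with exactly this kind of doubly linked list, so a minimal LRU cache can be sketched as follows (a hedged illustration, not the only way to build one):

import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: each access moves the entry in the list
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry; the unlink is O(1)
    }
}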
A doubly linked list is used in navigation systems where forward and backward navigation is required. It is also used by browsers to implement backward and forward navigation of visited web pages, i.e. the back and forward buttons.
Singly Linked List vs Doubly Linked List vs Dynamic Arrays:
When comparing the three main data structures on time complexity, doubly linked lists are the most efficient across the major tasks and operations. A doubly linked list operates in constant time for every operation except access by index, which takes linear time, O(n), since it needs to iterate through the nodes to reach the required index. For insert, remove, first, last, concatenation and count, a doubly linked list operates in constant time (given a stored size and a tail pointer), where dynamic arrays take linear time, O(n).
In terms of space complexity, dynamic arrays store only the elements themselves, so they carry no per-element pointer overhead; singly linked lists also store the successor of each element, adding linear, O(n), overhead; and, worst of all, doubly linked lists store both the predecessor and the successor of each element, which is likewise linear overhead, but roughly 2n pointers.
Unless you have extremely limited resources or space, either dynamic arrays or singly linked lists may be better; however, nowadays space and resources are more and more abundant, so doubly linked lists are often far better, at the cost of more space.
A doubly linked list is more effective than a singly linked list when the location of the element to be deleted is given, because only 4 pointers need to be operated on (2 when the element to be deleted is at the first node or at the last node).
struct Node {
    int Value;
    struct Node *Fwd; /* pointer to the next node */
    struct Node *Bwd; /* pointer to the previous node */
};
The single line of code below is enough to delete the element, if the element to be deleted is not in the first or last node.
X->Bwd->Fwd = X->Fwd; X->Fwd->Bwd = X->Bwd;
I was looking for a simple data structure that fulfills my needs in the least possible time (in the worst case):
(1) Pop the n-th element (I have to keep the relative order of the remaining elements intact).
(2) Access the n-th element.
I couldn't use a plain array because it can't pop, and I don't want a gap after deleting the i-th element. I tried to remove the gap by exchanging the n-th element with the next, then that one with its next, and so on until the last, but that proved time-inefficient, though the array's O(1) access is unbeatable.
I tried using a vector, with erase for popping and .at() for access, but even this is not cheap time-wise, though it's better than the array.
What you can try is a skip list - it supports the operations you are requesting in O(log(n)). Another option would be a tiered vector, which is just slightly easier to implement and takes O(sqrt(n)). Both structures are quite cool, but alas, not very popular.
Well, a tiered vector implemented on an array would, I think, best fit your purpose. The tiered vector concept may be new and a little tricky to understand at first, but once you get it, it opens up a lot of possibilities and gives you a handy weapon for tackling the data-structure part of many problems very efficiently. So it is recommended that you master the tiered vector's implementation.
An array will give you O(1) lookup but O(n) delete of the element.
A list will give you O(n) lookup but O(1) delete of the element.
A binary search tree will give you O(log n) lookup with O(1) delete of the element. But it doesn't preserve the relative order.
A binary search tree used in conjunction with the list will give you the best of both worlds. Insert a node into both the list (to preserve order) and the tree (fast lookup). Delete will be O(1).
struct node {
    node* list_next;  // doubly linked list links: preserve the relative order
    node* list_prev;
    node* tree_right; // binary search tree links: fast lookup
    node* tree_left;
    // node data;
};
Note that if the nodes are inserted into the tree using the index as the sort value, you will end up with another linked list pretending to be a tree. The tree can, however, be balanced in O(n) time once it is built, a cost you only have to incur once.
Update
Thinking about this more, this might not be the best approach for you. I'm used to doing lookups on the data itself, not on its relative position in a set; that is a data-centric approach. Using the index as the sort value will break as soon as you remove a node, since the "higher" indices will need to change.
Warning: Don't take this answer seriously.
In theory, you can do both in O(1), assuming these are the only operations you want to optimize for. The following solution will need lots of space (and it will leak space), and it will take a long time to create the data structure:
Use an array. In every entry of the array, point to another array which is the same, but with that entry removed.