What are some uses for linked lists? - data-structures

Do linked lists have any practical uses at all. Many computer science books compare them to arrays and say the main advantage is that they are mutable. However, most languages provide mutable versions of arrays. So do linked lists have any actual uses in the real world, or are they just part of computer science theory?

They're absolutely precious (in both the popular doubly-linked version and the less-popular, but simpler and faster when applicable!, single-linked version). For example, inserting (or removing) a new item in a specified "random" spot in a "mutable version of an array" (e.g. a std::vector in C++) is O(N) where N is the number of items in the array, because all that follow (on average half of them) must be shifted over, and that's an O(N) operation; in a list, it's O(1), i.e., constant-time, if you already have e.g. the pointer to the "previous" item. Big-O differences like this are absolutely huge -- the difference between a real-world usable and scalable program, and a toy, "homework"-level one!-)

Linked lists have many uses. For example, implementing data structures that appear to the end user to be mutable arrays.
If you are using a programming language that provides implementations of various collections, many of those collections will be implemented using linked lists. When programming in those languages, you won't often be implementing a linked list yourself but it might be wise to understand them so you can understand what tradeoffs the libraries you use are making. In other words, the set "just part of computer science theory" contains elements that you just need to know if you are going to write programs that just work.

The main Applications of Linked Lists are
For representing Polynomials
It means in addition/subtraction /multipication.. of two polynomials.
Eg:p1=2x^2+3x+7 and p2=3x^3+5x+2
p1+p2=3x^3+2x^2+8x+9
In Dynamic Memory Management
In allocation and releasing memory at runtime.
*In Symbol Tables
in Balancing paranthesis
Representing Sparse Matrix
Ref:-
http://www.cs.ucf.edu/courses/cop3502h.02/linklist3.pdf

So do linked lists have any actual uses in the real world,
A Use/Example of Linked List (Doubly) can be Lift in the Building.
- A person have to go through all the floor to reach top (tail in terms of linked list).
- A person can never go to some random floor directly (have to go through intermediate floors/Nodes).
- A person can never go beyond the top floor (next to the tail node is assigned null).
- A person can never go beyond the ground/last floor (previous to the head node is assigned null in linked list).

Yes of course it's useful for many reasons.
Anytime for example that you want efficient insertion and deletion from the list. To find a place of insertion you have an O(N) search, but to do an insertion if you already have the correct position it is O(1).
Also the concepts you learn from working with linked lists help you learn how to make tree based data structures and many other data structures.

A primary advantage to a linked list as opposed to a vector is that random-insertion time is as simple as decoupling a pair of pointers and recoupling them to the new object (this is of course, slightly more work for a doubly-linked list). A vector, on the other hand generally reorganizes memory space on insertions, causing it to be significantly slower. A list is not as efficient, however, at doing things like adding on the end of the container, due to the necessity to progress all the way through the list.

An Immutable Linked List is the most trivial example of a Persistent Data Structure, which is why it is the standard (and sometimes even only) data structure in many functional languages. Lisp, Scheme, ML, Haskell, Scala, you name it.

Linked Lists are very useful in dynamic memory allocation. These lists are used in operating systems. insertion and deletion in linked lists are very useful. Complex data structures like tree and graphs are implemented using linked lists.

Arrays that grow as needed are always just an illusion, because of the way computer memory works. Under the hood, it's just a continous block of memory that has to be reallocated when enough new elements have been added. Likewise if you remove elements from the array, you'll have to allocate a new block of memory, copy the array and release the previous block to reclaim the unused memory. A linked list allows you to grow and shrink a list of elements without having to reallocate the rest of the list.

Linked lists are useful because elements can be efficiently spliced and removed in the middle as others noted. However a downside to linked lists are poor locality of reference. I prefer not using lists for this reason unless I have an explicit need for the capabilities.

Related

Do linked lists have a practical performance advantage in eager functional languages?

I'm aware that in lazy functional languages, linked lists take on generator-esque semantics, and that under an optimizing compiler, their overhead can be completely removed when they're not actually being used for storage.
But in eager functional languages, their use seems no less heavy, while optimizing them out seems more difficult. Is there a good performance reason that languages like Scheme use them over flat arrays as their primary sequence representation?
I'm aware of the obvious time complexity implications of using singly-linked lists; I'm more thinking about the in-practice performance consequences of using eager singly-linked lists as a primary sequence representation, compiler optimizations considered.
TL; DR: No performance advantage!
The first Lisp had cons (linked list cell) as only data structure and it used it for everything. Both Common Lisp and Scheme have vectors today, but for functional style it's not a good match. The linked list can have one recursive step add zero or more elements in front of an accumulator and in the end you have a list that was made with sharing between the iterations. The operation might do more than one recursion making several versions all sharing the tail. I would say sharing is the most important aspect of the linked list. If you make a minimax algorithm and store the state in a linked list you can change the state without having to copy the unchanged parts of the state.
Creator of C++ Bjarne Stroustrup mentions in a talk that the penalty of having the data structure scrambled and double the size like in a linked list is easily outperformed even if you insert in order and need to move half the elements in the data structure. Keep in mind these were doubly linked lists and mutation inserts, but he mentions that most of the time was following the pointers linearly, getting all the CPU cache misses, to find the correct spot and thus for every O(n) search in a sorted list a vector would be better.
If you have a program where you do many inserts on a list which is not in front then perhaps a tree is a better choice, which you can do in CL and Scheme with cons. In fact all of Chris Okasaki purly functional data structures can be implemented with cons. Most "mutable" structures in Haskell are implemented similar to it.
If you are suffering from performance problems in Scheme and after profiling you find out you should try to replace a linked list operation with an array one there is nothing standing in the way of that. In the end all algorithm choices have pros and cons. Hard computations are hard in any language.

What are appropriate applications for a linked (doubly as well) list?

I have a question about fundamentals in data structures.
I understand that array's access time is faster than a linked list. O(1)- array vs O(N) -linked list
But a linked list beats an array in removing an element since there is no shifting needing O(N)- array vs O(1) -linked list
So my understanding is that if the majority of operations on the data is delete then using a linked list is preferable.
But if the use case is:
delete elements but not too frequently
access ALL elements
Is there a clear winner? In a general case I understand that the downside of using the list is that I access each node which could be on a separate page while an array has better locality.
But is this a theoretical or an actual concern that I should have?
And is the mixed-type i.e. create a linked list from an array (using extra fields) good idea?
Also does my question depend on the language? I assume that shifting elements in array has the same cost in all languages (at least asymptotically)
Singly-linked lists are very useful and can be better performance-wise relative to arrays if you are doing a lot of insertions/deletions, as opposed to pure referencing.
I haven't seen a good use for doubly-linked lists for decades.
I suppose there are some.
In terms of performance, never make decisions without understanding relative performance of your particular situation.
It's fairly common to see people asking about things that, comparatively speaking, are like getting a haircut to lose weight.
Before writing an app, I first ask if it should be compute-bound or IO-bound.
If IO-bound I try to make sure it actually is, by avoiding inefficiencies in IO, and keeping the processing straightforward.
If it should be compute-bound then I look at what its inner loop is likely to be, and try to make that swift.
Regardless, no matter how much I try, there will be (sometimes big) opportunities to make it go faster, and to find them I use this technique.
Whatever you do, don't just try to think it out or go back to your class notes.
Your problem is different from anyone else's, and so is the solution.
The problem with a list is not just the fragmentation, but mostly the data dependency. If you access every Nth element in array you don't have locality, but the accesses may still go to memory in parallel since you know the address. In a list it depends on the data being retrieved, and therefore traversing a list effectively serializes your memory accesses, causing it to be much slower in practice. This of course is orthogonal to asymptotic complexities, and would harm you regardless of the size.

Linked List vs Vector

Over the past few days I have been preparing for my very first phone interview for a software development job. In researching questions I have come up with this article.
Every thing was great until I got to this passage,
"When would you use a linked list vs. a vector? "
Now from experience and research these are two very different data structures, a linked list being a dynamic array and a vector being a 2d point in space. The only correlation I can see between the two is if you use a vector as a linked list, say myVector(my value, pointer to neighbor)
Thoughts?
Vector is another name for dynamic arrays. It is the name used for the dynamic array data structure in C++. If you have experience in Java you may know them with the name ArrayList. (Java also has an old collection class called Vector that is not used nowadays because of problems in how it was designed.)
Vectors are good for random read access and insertion and deletion in the back (takes amortized constant time), but bad for insertions and deletions in the front or any other position (linear time, as items have to be moved). Vectors are usually laid out contiguously in memory, so traversing one is efficient because the CPU memory cache gets used effectively.
Linked lists on the other hand are good for inserting and deleting items in the front or back (constant time), but not particularly good for much else: For example deleting an item at an arbitrary index in the middle of the list takes linear time because you must first find the node. On the other hand, once you have found a particular node you can delete it or insert a new item after it in constant time, something you cannot do with a vector. Linked lists are also very simple to implement, which makes them a popular data structure.
I know it's a bit late for this questioner but this is a very insightful video from Bjarne Stroustrup (the inventor of C++) about why you should avoid linked lists with modern hardware.
https://www.youtube.com/watch?v=YQs6IC-vgmo
With the fast memory allocation on computers today, it is much quicker to create a copy of the vector with the items updated.
I don't like the number one answer here so I figured I'd share some actual research into this conducted by Herb Sutter from Microsoft. The results of the test was with up to 100k items in a container, but also claimed that it would continue to out perform a linked list at even half a million entities. Unless you plan on your container having millions of entities, your default container for a dynamic container should be the vector. I summarized more or less what he says, but will also link the reference at the bottom:
"[Even if] you preallocate the nodes within a linked list, that gives you half the performance back, but it's still worse [than a vector]. Why? First of all it's more space -- The per element overhead (is part of the reason) -- the forward and back pointers involved within a linked list -- but also (and more importantly) the access order. The linked list has to traverse to find an insertion point, doing all this pointer chasing, which is the same thing the vector was doing, but what actually is occurring is that prefetchers are that fast. Performing linear traversals with data that is mapped efficiently within memory (allocating and using say, a vector of pointers that is defined and laid out), it will outperform linked lists in nearly every scenario."
https://youtu.be/TJHgp1ugKGM?t=2948
Use vector unless "data size is big" or "strong safety guarantee is essential".
data size is big
:- vector inserting in middle take linear time(because of the need to shuffle things around),but other are constant time operation (like traversing to nth node).So there no much overhead if data size is small.
As per "C++ coding standards Book by Andrei Alexandrescu and Herb Sutter"
"Using a vector for small lists is almost always superior to using list. Even though insertion in the middle of the sequence is a linear-time operation for vector and a constant-time operation for list, vector usually outperforms list when containers are relatively small because of its better constant factor, and list's Big-Oh advantage doesn't kick in until data sizes get larger."
strong safety guarantee
List provide strong safety guaranty.
http://www.cplusplus.com/reference/list/list/insert/
As a correction on the Big O time of insertion and deletion within a linked list, if you have a pointer that holds the position of the current element, and methods used to move it around the list, (like .moveToStart(), .moveToEnd(), .next() etc), you can remove and insert in constant time.

Data structure name: combination array/linked list

I have come up with a data structure that combines some of the advantages of linked lists with some of the advantages of fixed-size arrays. It seems very obvious to me, and so I'd expect someone to have thought of it and named it already. Does anyone know what this is called:
Take a small fixed-size array. If the number of elements you want to put in your array is greater than the size of the array, add a new array and whatever pointers you like between the old and the new.
Thus you have:
Static array
—————————————————————————
|1|2|3|4|5|6|7|8|9|a|b|c|
—————————————————————————
Linked list
———— ———— ———— ———— ————
|1|*->|2|*->|3|*->|4|*->|5|*->NULL
———— ———— ———— ———— ————
My thing:
———————————— ————————————
|1|2|3|4|5|*->|6|7|8|9|a|*->NULL
———————————— ————————————
Edit: For reference, this algorithm provides pretty poor worst-case addition/deletion performance, and not much better average-case. The big advantage for my scenario is the improved cache performance for read operations.
Edit re bounty: Antal S-Z's answer was so complete and well-researched that I wanted to provide em with a bounty for it. Apparently Stack Overflow doesn't let me accept an answer as soon as I've offered a bounty, so I'll have to wait (admittedly I am abusing the intention bounty system somewhat, although it's in the name of rewarding someone for an excellent answer). Of course, if someone does manage to provide a better answer, more power to them, and they can most certainly have the bounty instead!
Edit re names: I'm not interested in what you'd call it, unless you'd call it that because that's what authorities on the subject would call it. If it's a name you just came up with, I'm not interested. What I want is a name that I can look up in text books and with Google. (Also, here's a tip: Antal's answer is what I was looking for. If your answer isn't "unrolled linked list" without a very good reason, it's just plain wrong.)
It's called an unrolled linked list. There appear to be a couple of advantages, one in speed and one in space. First, if the number of elements in each node is appropriately sized (e.g., at most the size of one cache line), you get noticeably better cache performance from the improved memory locality. Second, since you have O(n/m) links, where n is the number of elements in the unrolled linked list and m is the number of elements you can store in any node, you can also save an appreciable amount of space, which is particularly noticeable if each element is small. When constructing unrolled linked lists, apparently implementations will try to generally leave space in the nodes; when you try to insert in a full node, you move half the elements out. Thus, at most one node will be less than half full. And according to what I can find (I haven't done any analysis myself), if you insert things randomly, nodes tend to actually be about three-quarters full, or even fuller if operations tend to be at the end of the list.
And as everyone else (including Wikipedia) is saying, you might want to check out skip lists. Skip lists are a nifty probabilistic data structure used to store ordered data with O(log n) expected running time for insert, delete, and find. It's implemented by a "tower" of linked lists, each layer having fewer elements the higher up it is. On the bottom, there's an ordinary linked list, having all the elements. At each successive layer, there are fewer elements, by a factor of p (usually 1/2 or 1/4). The way it's built is as follows. Each time an element is added to the list, it's inserted into the appropriate place in the bottom row (this uses the "find" operation, which can also be made fast). Then, with probability p, it's inserted into the appropriate place in the linked list "above" it, creating that list if it needs to; if it was placed in a higher list, then it will again appear above with probability p. To query something in this data structure, you always check the top lane, and see if you can find it. If the element you see is too large, you drop to the next lowest lane and start looking again. It's sort of like a binary search. Wikipedia explains it very well, and with nice diagrams. The memory usage is going to be worse, of course, and you're not going to have the improved cache performance, but it is generally going to be faster.
References
“Unrolled Linked List”, http://en.wikipedia.org/wiki/Unrolled_linked_list
“Unrolled Linked Lists”, Link
“Skip List”, http://en.wikipedia.org/wiki/Skip_list
The skip list lecture(s) from my algorithms class.
CDR coding (if you're old enough to remember Lisp Machines).
Also see ropes which is a generalization of this list/array idea for strings.
I would call this a bucket list.
While I don't know your task, I would strongly suggest you have a look at skip lists.
As for name, I'm thinking a bucket list would probably be most apropos
You can call it LinkedArrays.
Also, I would like to see the pseudo-code for the removeIndex operation.
What are the advantages of this data structure in terms of insertion and deletion?
Ex:
What if you want to add an element between 3 and 4? still have to do a shift, it takes O(N)
How do you find out the correct bucket for elementAt?
I agree with jer, you must take a look on skip list. It brings the advantages of Linked List and Arrays. The most of operations are done in O(log N)

What is the standard OCaml data structure with fastest iteration?

I'm looking for a container that provides fastest unordered iterations through the encapsulated elements. In other words, "add once, iterate many times".
Is there one among OCaml's standard modules that is fast enough (such that further optimization of it would be useless)? Or some kind of third-party GPL-ready ones?
AFAIK there's just one OCaml compiler, so the concept of being fast is more or less clear...
...But after I saw a couple of answers, it appears, it's not. Of course, there's a plenty of data structures that allow O(n) iteration through container of size n. But the task I'm solving is one of those, where difference between O(n) and O(2n) matters ;-).
I also see that Arrays and Lists provide unnecessary information about the order of elements added, which I don't need. Maybe in "functional world" there exists data structures such that can trade this information for a bit of iteration speed.
In C I would outright pick a plain array. The question is, what should I pick in OCaml?
You are unlikely to do better than built-in arrays and lists, since they are hand-coded in C, unless you bind to your own native implementation of an iterator. An array will behave almost exactly like an array in C (a contiguously allocated block of memory containing a sequence of element values), possibly with some extra pointer indirections due to boxing. List are implemented exactly how you would expect: as cells with a value and a "next" pointer. Arrays will give you the best locality for unboxed types (especially floats, which have a super-special unboxed implementation).
For information about the implementation of arrays and lists, see Section 18.3 of the OCaml manual and the files byterun/mlvalues.h, byterun/array.c, and byterun/alloc.c in the OCaml source code.
From the questioner: indeed, Array appeared to be the fastest solution. However it only outperformed List by 7%. Maybe it was because the type of an array element was not plain enough: it was an algebraic type. Hashtbl performed 4 times worse, as expected.
So, I will pick Array and I'm accepting this one. good.
To know for sure, you're going to have to measure. Based on the machine instructions the compiler is likely to generate, I would try an array, then a list.
Access to an array element requires a bounds check, address arithmetic, and a load
Access to the head of a list requires a load, a test for empty list, and a load at a known compile-time offset.
The details of which is faster probably depend on your application and what else is happening on your machine. They also depend on the type of elements; for example, if they are floating-point numbers, ocamlopt may be clever enough to make an unboxed array, which will save you a level of indirection.
Other common data structures like hash tables or balanced trees generally require that you allocate some context somewhere to keep track of where you are. With an array, keeping track requires only an integer index; with a list, keeping track requires a single pointer. I think this is going to be hard to beat in another data structure.
Finally please note that there may be only one OCaml compiler, but it has two back ends: bytecode and native code. Naturally if you care about this level of performance, you are using the native-code ocamlopt version. Right?
Please take measurements and edit the results into your question.
Don't forget about Bigarrays, they are most close to C arrays (just a flat piece of memory), but cannot contain arbitrary OCaml values. Also consider switching bounds checking off (unsafe_set/get). And of course you should profile first.
The array - a linear piece of memory with the items visited in sequential order - best utilises the CPU's L1 data cache.
All common data structures are iterable in O(n) time, so the differences between data structures will only be constant (and very probably not significant).
At least lists and arrays allow iteration without significant overhead. I can't think of a situation where that would not be fast enough.

Resources