I know the concept of linearity in data structures, but I'd like to know which data structures are linear and which are non-linear. Please show me a list of the commonly used data structures of both types.
Here is an explanation that also gives some examples for both cases.
Linear: list, array
Non-Linear: tree, graph
In Linear Data Structures, data members are accessed sequentially.
Examples: Arrays, Linked Lists, Queues, Stacks, Doubly Linked Lists.
In Non-Linear Data Structures, a data member may have connections with several other data members; these structures follow no set sequence.
Examples: Graphs, Trees.
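As a tiny illustration in Java (the class here is mine, for illustration only): iterating an array visits elements one after another in a single sequence, while visiting every node of a tree requires following multiple branches.

```java
import java.util.List;

class TreeNode {
    int value;
    List<TreeNode> children; // a node may connect to several others

    TreeNode(int value, List<TreeNode> children) {
        this.value = value;
        this.children = children;
    }

    // Non-linear access: follow every branch (here, recursively).
    void printAll() {
        System.out.println(value);
        for (TreeNode child : children) {
            child.printAll();
        }
    }

    public static void main(String[] args) {
        // Linear access: one element after another, in sequence.
        int[] array = {1, 2, 3, 4};
        for (int x : array) {
            System.out.println(x);
        }

        // Non-linear access: the root connects to two children.
        new TreeNode(1, List.of(
                new TreeNode(2, List.of()),
                new TreeNode(3, List.of()))).printAll();
    }
}
```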
Skiena's Algorithm Design Manual (3rd ed., p. 204) refers to adjacency lists as opposed to general adjacency representations, defining them as assigning to each vertex a a singly linked list L_a with underlying set set(L_a) = { b : (a, b) ∈ edges }.
I'm surprised that Skiena presents the singly linked list as the definitive data structure implementing the collections L_a. My impression is that linked lists are generally losing favor compared with arrays and hash tables, because:
They are not cache-friendly to iterate over (the way arrays are), and the gap between processor speed and main-memory access time has become more important. (See, for instance, this video (7m) by Stroustrup.)
They don't bring much to the table, particularly when order isn't important. The advantage of linked lists over arrays is that they admit constant-time add and delete. But when we don't care about order, these can be constant-time operations on arrays as well, using "swap and pop" for deletes (sketched below). A hash table would have the additional advantage of constant-time search. My understanding is that hash tables cost more memory than a linked list or an array, but this consideration has become relatively less important. (Perhaps this claim isn't meaningful in the absence of a specific application.)
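For concreteness, here's a minimal Java sketch of the swap-and-pop delete (the helper name swapAndPop is mine, for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class SwapAndPop {
    // Constant-time delete when element order doesn't matter:
    // overwrite the removed slot with the last element, then drop the tail.
    static <T> void swapAndPop(List<T> list, int index) {
        int last = list.size() - 1;
        list.set(index, list.get(last)); // move the last element into the hole
        list.remove(last);               // removing the tail is O(1)
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(List.of("a", "b", "c", "d"));
        swapAndPop(items, 1);            // delete "b"
        System.out.println(items);       // [a, d, c] -- order not preserved
    }
}
```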
Other sources treat adjacency lists differently. For instance Wikipedia presents an implementation where the L_a are arrays. And in Stone's Algorithms for Functional Programming the L_a are unordered sets, implemented ultimately as Scheme lists (which in turn struck me as strange).
My Question: Is there a consideration I'm missing which gives singly linked lists a significant advantage in adjacency representations?
I add an earnest request that before you vote to close this question, or post a comment with an uncharitable tone, you ask yourself whether you are really helping this site achieve its goals by doing so.
I don't think there's any general agreement on singly-linked lists as the default representation of adjacency lists in most real-world use cases.
A singly-linked list is, however, pretty much the most restrictive implementation of an adjacency list you could have, so in a book about "Algorithm Design", it makes sense to think of adjacency lists in this representation unless you need something special from them, like random access, bidirectional iteration, binary search, etc.
When it comes to practical implementations of algorithms on explicit graphs (most implementations are on implicit graphs), I don't think singly-linked lists are usually a good choice.
My go-to adjacency list graph representation is a pair of parallel arrays:
Vertices are numbered from 0 to n-1
There is an edge array that contains all of the edges sorted by their source vertex number. For an undirected graph, each edge appears in here twice. The source vertices often don't need to be stored in here.
There is a vertex array that stores, for each vertex, the end position of its edges in the edge array.
This is a nice, compact, cache-friendly representation that is easy to work with and requires only two allocations.
I can usually find an easy way to construct a graph like this in linear time by first filling the vertex array with counts, then changing the counts to start positions (shifted cumulative sum), and then populating the edge array, advancing those positions as edges are added.
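Here is a minimal sketch of that construction in Java (the class and field names are mine; a directed graph is assumed, with edges given as {source, target} pairs):

```java
public class CsrGraph {
    final int[] vertexEnd; // after construction: end of v's edges in 'edges'
    final int[] edges;     // all edge targets, grouped by source vertex

    CsrGraph(int n, int[][] edgeList) {
        vertexEnd = new int[n];
        edges = new int[edgeList.length];

        // 1. Count each vertex's out-degree.
        for (int[] e : edgeList) vertexEnd[e[0]]++;

        // 2. Shifted cumulative sum: turn counts into start positions.
        int sum = 0;
        for (int v = 0; v < n; v++) {
            int count = vertexEnd[v];
            vertexEnd[v] = sum;
            sum += count;
        }

        // 3. Populate the edge array, advancing each position as edges are
        //    added; afterwards vertexEnd[v] holds the end of v's range.
        for (int[] e : edgeList) {
            edges[vertexEnd[e[0]]++] = e[1];
        }
    }

    // The neighbors of v occupy edges[neighborsStart(v) .. vertexEnd[v]).
    int neighborsStart(int v) {
        return v == 0 ? 0 : vertexEnd[v - 1];
    }
}
```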
I've been working on an implementation of a radix tree (for strings/char arrays), but I'm having somewhat of a dilemma figuring out how to store which tree nodes are children of a particular tree node.
I've seen linked-list implementations used in tries (somewhat similar to radix trees) and possibly in some radix trees (it's been a while since I last researched this topic), but that seems like it would perform very poorly, especially if you have a data set that contains lots of common prefixes.
Now I'm wondering whether using another data structure (e.g., a binary search tree) would be a better design choice. I think I can see a very substantial speed improvement over a simple linked list (O(log n) vs. O(n)) when the data has a large number of common prefixes, but could there be some substantial compromises to performance elsewhere?
In particular I'm worried about cases where there aren't a large number of common prefixes, or any other possible obstacles which may cause one to choose a linked list over a binary search tree.
Alternatively, is there a better (i.e., faster or more memory-efficient) method for storing the child nodes?
You want to look for a kart-trie. A kart-trie uses a BST-like data structure with a simple hash. You can find a description here: http://code.dogmap.org/kart.
You could use a trie in place of a BST or list. For BST you'll have to compute a hash which could be as expensive as traversing the trie (I'm thinking of a trie with an array of pointers to children, where you use a character as an index). You'll end up with a trie of tries. A better solution could be to build a trie, and then compress the links that aren't branching.
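For reference, here is a minimal Java sketch of that array-indexed trie node (lowercase a-z keys assumed; the class and method names are mine):

```java
// Each node holds an array of child pointers indexed directly by character,
// so following a link is a single array lookup -- no hashing, no list walk.
class TrieNode {
    TrieNode[] children = new TrieNode[26]; // lowercase a-z only
    boolean isWord;

    void insert(String word) {
        TrieNode node = this;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.children[i] == null) {
                node.children[i] = new TrieNode();
            }
            node = node.children[i];
        }
        node.isWord = true;
    }

    boolean contains(String word) {
        TrieNode node = this;
        for (char c : word.toCharArray()) {
            node = node.children[c - 'a'];
            if (node == null) return false;
        }
        return node.isWord;
    }
}
```

Compressing the chains of non-branching nodes in such a trie into single string-labelled edges is exactly what turns it back into a radix tree.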
This question was the last straw; I've been wondering about it for a long time:
Why do people think of "Algorithms" and "Data Structures" as things that can be separated from each other?
I see a lot of evidence that they're separated in programmers' minds.
they request "Data Structures & Algorithms" books
they refer to "Data Structures" and "Algorithms" as separate university courses
they "know Algorithms", but are "weak in Data Structures" (can't find the link, sorry).
etc.
In my opinion, "Data Structures" are algorithms, since the concept of a "Data Structure" is about the algorithms that operate on the data going into and out of the structure. But this opinion doesn't seem to be mainstream. What am I missing?
Edit: unfortunately, I did not formulate the question well. A separation of data structures and algorithms in programs people write is natural, since, well, the former is data, and the latter is functions (and in semi-functional frameworks like STL it's the core of the whole thing).
But the points above, and the question itself, refer to the way people think, to the way they arrange knowledge in their heads. This doesn't even have to relate to writing code.
Here are some links where people separate "algorithms" and "data structures" when they're the same thing:
Revisions: algorithm and data structure
They are different. Consider graphs, or trees to be more specific. Now, a tree appears to be just a tree, but you can traverse it in preorder, inorder, or postorder (three algorithms for one structure).
A node can have exactly two children or many. The tree can be balanced (like an AVL tree) or carry additional information (like B-tree indexes in databases). Those are different structures, but you can still traverse them with the same algorithm.
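To make the traversal point concrete, here's a minimal Java sketch: one node type, three traversal algorithms (the class is mine, for illustration):

```java
class Node {
    int value;
    Node left, right;

    // Visit root, then left subtree, then right subtree.
    void preorder() {
        System.out.println(value);
        if (left != null) left.preorder();
        if (right != null) right.preorder();
    }

    // Visit left subtree, then root, then right subtree.
    void inorder() {
        if (left != null) left.inorder();
        System.out.println(value);
        if (right != null) right.inorder();
    }

    // Visit left subtree, then right subtree, then root.
    void postorder() {
        if (left != null) left.postorder();
        if (right != null) right.postorder();
        System.out.println(value);
    }
}
```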
See it now?
Another point: algorithms sometimes are and sometimes are not independent of data structures. Certain algorithms have different complexity over different structures (finding paths in a graph represented as an adjacency list versus a 2D matrix).
Algorithms and data structures are tightly wound together. An algorithm depends on its data structure; if you change either of them, the complexity can change considerably. They are not the same, but they are definitely two sides of the same coin. Selecting a good data structure is itself a path toward a better algorithm.
For instance, priority queues can be implemented using binary heaps or binomial heaps; binary heaps allow peeking at the highest-priority element in constant time, whereas binomial heaps require O(log n) time for a peek.
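As a concrete illustration, java.util.PriorityQueue is a binary heap, so its peek() is constant time while offer() and poll() pay O(log n):

```java
import java.util.PriorityQueue;

public class PeekExample {
    public static void main(String[] args) {
        // java.util.PriorityQueue is a binary heap (a min-heap under
        // natural ordering): peek() is O(1), offer()/poll() are O(log n).
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        heap.offer(5);
        heap.offer(1);
        heap.offer(3);
        System.out.println(heap.peek()); // 1 -- highest priority, O(1)
        System.out.println(heap.poll()); // 1 -- removal pays O(log n)
    }
}
```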
So a particular algorithm works best for a particular data structure (in a particular context); algorithms and data structures go hand in hand!
People refer to them as different entities because they are. Suppose I want to find an element in a set of data. If I put that data into an array, the array is a data structure. Once it's in the array, I can use multiple different algorithms to find the element I'm interested in: I could sort the array (with any of several sorts) and then use a binary search, I could check each element linearly, and so on. The choice of the array as the data structure, as opposed to, say, a linked list, is not choosing an algorithm.
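A minimal Java sketch of that point: one data structure (an array), two different search algorithms over it.

```java
import java.util.Arrays;

public class FindElement {
    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88};
        int target = 19;

        // Algorithm 1: linear scan -- works on the array as-is, O(n).
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) {
                System.out.println("linear scan found index " + i);
            }
        }

        // Algorithm 2: sort, then binary search -- O(n log n) + O(log n).
        Arrays.sort(data);
        int index = Arrays.binarySearch(data, target);
        System.out.println("binary search found index " + index);
    }
}
```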
That said, it is important to understand one to understand the other. If you do not understand algorithms well then it is not obvious what the advantages and disadvantages of different data structures are, and vice versa. As such, it makes sense to teach them simultaneously. They are however different entities.
[Edit] Think about this: If you look at pseudo-code for most algorithms, a data structure isn't specified. You may have a "list" of elements to iterate through etc, but the exact implementation of that list is unimportant to the correctness of the algorithm.
I would say it's because functional programming separates what is operated on from the operations themselves. Targets and actions are certainly different, even if they're closely intertwined.
It was object-oriented programming that put data and operations into a single component. Perhaps if OO had come along earlier there would have been one discipline.
The way I see it is that algorithms are something that work with or on data structures, so there is a difference between the two. A simple data structure is an array, but there are a lot of algorithms that operate on simple arrays, so there has to be a way of separating the two. An array can also represent a tree, and trees are handled with specialized algorithms.
The difference isn't big, because you can't really have one without the other most of the time, but sometimes you can. Consider the trivial algorithm that determines whether a number is prime: it uses no data structures. Consider the GCD algorithm: also no data structures. You can talk about an algorithm without talking about data structures, but you usually can't talk about a data structure without talking about algorithms. You can talk about a tree, but you'll need algorithms for insertion, removal, etc.
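Euclid's GCD algorithm is a good example of this; here it is as a small Java sketch, manipulating two numbers and nothing else:

```java
public class Gcd {
    // Euclid's algorithm: an algorithm with no data structure in sight.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = b;
            b = a % b;
            a = t;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18)); // 6
    }
}
```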
I think it's good that there is a distinction, because they are, conceptually, different things. An algorithm is a set of steps used to accomplish a task, while a data structure is something used to store data; the manipulation of that data is done with algorithms.
They are separate university courses. Typically, the data structures course emphasizes programming and is prerequisite to the algorithms course, which emphasizes mathematical analysis of algorithms. I don't think it's hard to see why many people with an undergraduate education in CS might think of them as separate.
I agree with you. Both are two sides of one and the same thing.
When talking about data structures, it's always about storing data in a way to optimize certain operations on this data, which leads us to algorithms and complexity.
The two are, of course, closely intertwined. This is why the posts you refer to request books on both. Not always, though: the core of a sort algorithm, for example, is unchanged no matter what sort of data structure it operates on.
The title of the book Algorithms + Data Structures = Programs (1976) by none other than Niklaus Wirth suggests that both are essential to writing a program.
I want to store my objects in sorted order, keyed by one of the object's attributes. Later on I'll access these objects sequentially from max key to min key, and I'll do some search tasks as well.
I'm considering either an AVL tree or a red-black tree. As far as I know, they are nearly equivalent in theory (both are O(log n)), but which might perform better in practice in my situation? And is there a better alternative than those, considering that I'll mostly insert into and sequentially access the data structure?
Edit: I'm going to use Java.
For what it's worth, in C#, SortedDictionary<K, V> is implemented as a red-black tree, and in many implementations of the STL in C++, std::map<K, T> is implemented as a red-black tree.
Also, from Wikipedia on AVL vs. red-black trees:
The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly balanced than red-black trees, leading to slower insertion and removal but faster retrieval. This makes it attractive for data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the order codes of an assembler or interpreter).
Whichever is easiest for you to implement. You won't get better than O(log n) insertion while keeping the data sorted, and we'd probably need a lot more detail than what you've provided to decide whether there are other factors that make another structure more appropriate.
As you're doing it in Java, consider using a TreeSet (although it's a Set, so you can't have duplicate entries)...
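If you need a key→object mapping rather than a set, TreeMap is backed by the same red-black tree. A minimal sketch of the max-to-min access pattern (the keys and values here are placeholders):

```java
import java.util.Map;
import java.util.TreeMap;

public class SortedAccess {
    public static void main(String[] args) {
        // TreeMap is a red-black tree: O(log n) insert and search.
        TreeMap<Integer, String> byKey = new TreeMap<>();
        byKey.put(3, "c");
        byKey.put(1, "a");
        byKey.put(2, "b");

        // Sequential access from max key to min key.
        for (Map.Entry<Integer, String> e : byKey.descendingMap().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }

        // Search tasks.
        System.out.println(byKey.get(2));      // "b"
        System.out.println(byKey.floorKey(2)); // greatest key <= 2
    }
}
```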
Are there any map data structures that have at least O(log n) insertion, deletion, access, and merging?
Most self-balancing binary trees such as AVL trees and red-black trees have most of these properties, but I believe they have O(n log n) merging. Are there any data structures that have faster merging?
Edit: I have looked around, and I can't find anything like this. If there is no such data structure, I would love some insight into why this is not possible.
I'd take a look at splay trees. You'll probably end up paying the merging cost along the way, but you should be able to inject another tree in and put the cost off until later.
Do you need a tree for arbitrary key types that only have a comparison defined, or would it be OK if it only works with types that have a fixed-size binary representation (int, long, float, double, ...)? If the latter is the case, then a binary radix tree is a data structure that has very efficient merging (O(1) if you are lucky, O(N) worst case).
See Fast Mergeable Integer Maps by Chris Okasaki and Andrew Gill for details of the data structure.
The Scala Collections Library contains an implementation for ints and longs. All other Java primitive types can be translated to either ints or longs, e.g. by using java.lang.Double.doubleToLongBits for Double.
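For example, the key translation for doubles might look like this in Java (a plain HashMap stands in for the long-keyed radix map here; this only demonstrates the translation step):

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleKeys {
    public static void main(String[] args) {
        // Translate a double key to a long so it can be used with a
        // long-keyed map; doubleToLongBits gives a stable bit pattern.
        Map<Long, String> byBits = new HashMap<>();
        double key = 3.14;
        byBits.put(Double.doubleToLongBits(key), "value");
        System.out.println(byBits.get(Double.doubleToLongBits(3.14)));
    }
}
```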