What is the purpose of using a bag object for the entries of an adjacency list in a graph? Why not use a data structure like a stack or a queue?
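The most important property of an adjacency list is that it is iterable. While a stack or a queue would certainly work, they aren't the preferred data structures for iteration, because they are built around removing elements in a particular order.
Another important property of a bag is that it may hold duplicate elements and promises no particular iteration order. Conceptually it is a multiset (note that C++ `std::multiset` additionally keeps its elements sorted).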
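A minimal sketch of the idea, assuming a bag that supports only adding and iterating. The names `Bag` and `Graph` are illustrative and not taken from any particular library:

```python
class Bag:
    """A collection that supports only adding items and iterating over them."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


class Graph:
    """Undirected graph whose adjacency lists are bags."""

    def __init__(self, vertex_count):
        self.adj = [Bag() for _ in range(vertex_count)]

    def add_edge(self, v, w):
        self.adj[v].add(w)
        self.adj[w].add(v)

    def neighbors(self, v):
        # Callers only ever iterate; no removal order is implied.
        return iter(self.adj[v])


g = Graph(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(0, 1)  # parallel edge: a bag keeps duplicates
```

Graph algorithms only ever ask "give me every neighbor of v", so the narrower bag interface states the intent more precisely than a stack or queue would.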
I am having a hard time differentiating between an Abstract Data Type (ADT) and a Logical Data Structure (LDS). The only difference I can think of is that ADTs have defined operations, whereas LDSs are more about how the data is stored. But a stack can be an ADT, as we know what kind of operations must be defined to call something a 'stack', and it can also be called an LDS, as we know how the data should be 'structured' to call it a 'stack'. Both ADT and LDS are 'abstract' in that we don't talk about how they are implemented.
So is it correct that ADT and LDS are different names for the same thing, but depending on where we come from we can call it ADT or LDS?
An abstract data type (ADT) is a mathematical model along with some operations that are performed on that model. For example, the stack ADT consists of a sequence of integers (say) along with the operations push, pop, isempty, makeemptystack and top. An ADT is similar to an interface in Java: it is the specification. Data structures are about how this specification is implemented. For example, the stack ADT operations can be implemented using an array data structure, or using a linked list data structure. A queue ADT can be implemented using a circular array data structure in such a manner that all the ADT operations can be done in O(1) time.
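To make the distinction concrete, here is a rough sketch of one stack ADT with two interchangeable data structures behind it. The class names and the `drain` helper are assumptions made for illustration:

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The ADT: only the operations, no storage details."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def top(self): ...

    @abstractmethod
    def is_empty(self): ...


class ArrayStack(Stack):
    """Stack backed by a dynamic array (a Python list)."""

    def __init__(self):
        self._data = []

    def push(self, item):
        self._data.append(item)

    def pop(self):
        return self._data.pop()  # raises IndexError if empty

    def top(self):
        return self._data[-1]

    def is_empty(self):
        return not self._data


class LinkedStack(Stack):
    """Stack backed by a singly linked list of (item, rest) pairs."""

    def __init__(self):
        self._head = None

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        item, self._head = self._head
        return item

    def top(self):
        if self._head is None:
            raise IndexError("top of empty stack")
        return self._head[0]

    def is_empty(self):
        return self._head is None


def drain(stack, items):
    """Push all items, then pop until empty; works for any Stack."""
    for item in items:
        stack.push(item)
    out = []
    while not stack.is_empty():
        out.append(stack.pop())
    return out
```

Code written against `Stack` (such as `drain`) behaves the same with either implementation, which is the sense in which the ADT is the spec and the data structure is the implementation.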
In a real-world problem, you would encounter only a subset of the possible operations, which form your ADT, and you need to find a data structure that would implement exactly this subset of operations efficiently. For example, in some applications, you want to maintain a set of integers, and the only operations you would do on this set are to find the value of the smallest element in the set, delete the smallest element in the set, and insert a new element into the set. (Applications include Dijkstra's shortest path algorithm, Prim's minimum spanning tree algorithm, and Huffman coding.) A set, together with these three operations MIN, EXTRACTMIN and INSERT, defines the min-priority queue ADT. A data structure that can implement all three operations efficiently, in O(log n) time or better, is a min-heap. Other data structures, such as linked lists, unsorted arrays, and sorted arrays, would take O(n) time for one or more of these operations, and hence are less efficient for this particular ADT.
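As a small sketch, the three operations can be layered on Python's standard `heapq` module, which maintains a binary min-heap inside a list. The wrapper class `MinPriorityQueue` and its method names are illustrative:

```python
import heapq


class MinPriorityQueue:
    """Min-priority queue ADT implemented with a binary min-heap."""

    def __init__(self):
        self._heap = []

    def insert(self, key):
        heapq.heappush(self._heap, key)    # O(log n)

    def minimum(self):
        return self._heap[0]               # O(1): smallest key is at the root

    def extract_min(self):
        return heapq.heappop(self._heap)   # O(log n)

    def __len__(self):
        return len(self._heap)


pq = MinPriorityQueue()
for key in [7, 3, 9, 1]:
    pq.insert(key)
```

With a sorted array, by contrast, `minimum` and `extract_min` can be fast but `insert` has to shift elements, which is the O(n) cost mentioned above.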
In this problem, I have a set of elements that are indexed from 1 to n. Each element actually corresponds to a graph node and I am trying to calculate random one-to-one matchings between the nodes. For the sake of simplicity, I neglect further details of the actual problem. I need to write a fast algorithm to randomly consume these elements (nodes) and do this operation multiple times in order to calculate different matchings. The purpose here is to create randomized inputs to another algorithm and each calculated matching at the end of this will be another input to that algorithm.
The most basic algorithm I can think of is to create copies of the elements in the form of an array, generate random integers, and use them as array indices to apply swap operations. This way each random copy can be created in O(n) but in practice, it uses a lot of copy and swap operations. Performance is very important and I am looking for faster ways (algorithms and data structures) of achieving this goal. It just needs to satisfy the two conditions:
It shall be able to consume a random element.
It shall be able to consume an element on the given index.
I tried to write as clearly as possible. If you have any questions, feel free to ask and I am happy to clarify. Thanks in advance.
Note: Matching is an operation where you pair the vertices on a graph if there exists an edge between them.
Shuffle index array (for example, with Fisher-Yates shuffling)
ia = [3,1,4,2]
Walk through index array and "consume" set element with current index
for x in ia:
    consume(Set[x])  # Set is indexed by x
So for this example you will get order Set[3], Set[1], Set[4], Set[2]
No elements are swapped; only the array of integers changes.
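The steps above could be sketched as follows. This is only an illustration: it uses 0-based indices rather than the 1 to n indexing of the question, and the function names and the `consume` callback are hypothetical:

```python
import random


def shuffled_indices(n, rng=random):
    """Return the indices 0..n-1 in uniformly random order (Fisher-Yates)."""
    ia = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        ia[i], ia[j] = ia[j], ia[i]
    return ia


def consume_in_random_order(elements, consume, rng=random):
    """Call consume() on every element once, in random order.

    Only the integer index array is permuted; `elements` is never
    copied or reordered.
    """
    for x in shuffled_indices(len(elements), rng):
        consume(elements[x])
```

Each call costs O(n) integer swaps, and the same `elements` array can be reused for every new matching.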
I've implemented a disjoint set data structure for my program and I realized I need to iterate over all equivalence classes.
Searching the web, I didn't find any useful information on the best way to implement that or how it influences complexity. I'm quite surprised since it seems like something that would be needed quite often.
Is there a standard way of doing this? I'm thinking about using a linked list (I use C so I plan to store some pointers in the top element of each equivalence class) and updating it on each union operation. Is there a better way?
You can store pointers to the top elements in a hash-based set or in any balanced binary search tree. You only need to delete and add elements, and both operations run in O(1)* for the hash set and in O(log N) for the tree. In a plain linked list, deleting an arbitrary element takes O(N) if you first have to search for it.
Your proposal seems very reasonable. If you thread a doubly-linked list through the representatives, you can splice out an element from the representatives list in time O(1) and then walk the list each time you need to list representatives.
@ardenit has mentioned that you can also use an external hash table or BST to store the representatives. That's certainly simpler to code up, though I suspect it won't be as fast as just threading a linked list through the items.
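A rough sketch of the threaded-list idea, written in Python with index arrays standing in for the C pointers. The class name, the sentinel trick, and the use of union by size with path halving are assumptions of this sketch, not details from the question:

```python
class DisjointSet:
    """Union-find that can also enumerate its current representatives."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        # Circular doubly linked list threaded through the representatives.
        # Index n is a sentinel head, so splicing needs no special cases.
        self.prev = [i - 1 for i in range(n + 1)]
        self.next = [i + 1 for i in range(n + 1)]
        self.prev[0] = n
        self.next[n] = 0

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        # rb is no longer a representative: splice it out in O(1).
        p, nx = self.prev[rb], self.next[rb]
        self.next[p] = nx
        self.prev[nx] = p
        return ra

    def representatives(self):
        """Yield one representative per equivalence class."""
        sentinel = len(self.parent)
        x = self.next[sentinel]
        while x != sentinel:
            yield x
            x = self.next[x]


ds = DisjointSet(5)
ds.union(0, 1)
ds.union(3, 4)
```

The extra work per union is constant, and listing the classes takes time proportional to the number of classes rather than the number of elements.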
I read a few books, but still didn't understand why a linked list is considered linear. I'm not sure whether it is because of its appearance, its sequential access, or something else. If possible, please explain in logical terms.
Data structures fall into two categories: Linear and Non-Linear. A data structure is said to be linear if the elements form a sequence, for example Array, Linked list, queue etc. Elements in a nonlinear data structure do not form a sequence, for example Tree, Hash tree, Binary tree, etc.
There are two ways of representing linear data structures in memory. One way is to have the linear relationship between the elements represented by means of sequential memory locations. Such linear structures are called arrays. The other way is to have the linear relationship between the elements represented by means of links. Such linear data structures are called linked lists.
That is because they follow a linear fashion: they move from one block to the next, step by step.
Linked lists, stacks, and queues are linear because they are connected in such a way that each node can have only one descendant, unlike trees and graphs, in which several children or nodes can be connected to a given node.
Just imagine nature.
When the events occur one after the other, you go for linear (time events, age, weather, etc.).
When the events branch out, you go for non-linear (for example the food chain, biological nomenclature, or the classification of living beings into kingdoms, families, etc.).
Linked list:
A linked list is a collection of nodes.
It has an end-to-end connection: each node holds the address of the next node, i.e. the nodes depend on each other. They are connected in a linear manner, hence a linked list is called a linear data structure.
Linear data structures:
the elements (objects) are sequential and ordered in a way so that:
there is only one first element, and it has only one next element,
there is only one last element, and it has only one previous element, while
all other elements have a next and a previous element
https://codeandwork.github.io/courses/java/linearDataStructures.html
non-linear data structures are those that are not linear!
A linear data structure is traversed sequentially, and from any data element only one other data element can be reached directly. Ex: arrays, linked lists.
But in a doubly linked list we can reach two data elements, using the previous pointer and the next pointer.
So can we say that a doubly linked list is a non-linear data structure?
Correct me if I am wrong.
Thank you.
Non-linear data structures are those in which the elements are arranged in a non-linear fashion, which requires a representation in two or more dimensions. The elements may or may not (mostly not) be stored in contiguous memory locations, and they can be accessed out of order, as if the elements in between had been skipped.
Example: in a tree, one may iterate from the root to its right child, then to that node's right child, and so on, thereby skipping all the left nodes.
But in a doubly linked list, you can only move sequentially (linearly), either forward (using the next pointer) or backward (using the previous pointer).
You can't jump from any element in the list to a distant element without traversing the intermediate elements.
Hence, a doubly linked list is a linear data structure. In a linear data structure, the elements are arranged in a linear fashion (that is, in a one-dimensional representation).
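A small sketch can illustrate the point: even with both pointers available, reaching a node k positions away takes k hops. The `Node` class and the helpers `build` and `walk` are illustrative names only:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def build(values):
    """Build a doubly linked list and return (head, tail)."""
    head = tail = None
    for v in values:
        node = Node(v)
        if tail is None:
            head = node
        else:
            tail.next = node
            node.prev = tail
        tail = node
    return head, tail


def walk(node, steps):
    """Move `steps` nodes forward (backward if negative), one hop at a time.

    Assumes the target stays inside the list. Returns (node, hops taken).
    """
    hops = 0
    while steps != 0:
        if steps > 0:
            node = node.next
            steps -= 1
        else:
            node = node.prev
            steps += 1
        hops += 1
    return node, hops


head, tail = build([10, 20, 30, 40, 50])
```

The second pointer only changes the direction of travel along the same line; it never offers a shortcut past the intermediate nodes.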
You are wrong; 2 justifications:
While you can get to 2 elements from any node, one of them was the one you used to get to this node, so you can only get to one new node from each.
It is still linear in that it has to be traversed sequentially, or in a line.
It is still sequential: you need to go over some elements in the list to get to a particular element, compared to an array where you can randomly access each element.
However, you can go linearly forwards or backwards, which may optimize the search.
A linked list is basically a linear data structure because it stores data in a linear fashion. A linear data structure is one that stores data in a linear format and is traversed sequentially rather than in a zigzag way.
It depends on where you intend to apply linked lists. If you base it on storage, a linked list is considered non-linear. On the other hand, if you base it on access strategy, a linked list is considered linear.