Efficient data structure to store subsets of a graph

I have derived some subsets of a graph. Now I want to store these subsets and label them, such as subset_1, subset_2, etc. Which data structure would be efficient for that?
This is the main graph
The circled marks are the subsets

Each subset is itself a graph (a subgraph), so you can store these subsets with the same graph data structure you use for the main graph.
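As a minimal sketch of that idea, assuming the main graph is kept as a dictionary mapping each node to a set of neighbors, each labeled subset can be stored as its own small graph under a label key. The helper name induced_subgraph and the sample graph are hypothetical, chosen only for illustration.

```python
def induced_subgraph(adj, nodes):
    """Return the subgraph of `adj` induced by `nodes` (same dict-of-sets shape)."""
    keep = set(nodes)
    # Keep only the edges whose both endpoints are in the chosen subset.
    return {u: {v for v in adj[u] if v in keep} for u in keep}

# Hypothetical main graph: node -> set of neighbors (undirected).
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

# Labeled subsets, each stored as a graph in its own right.
subsets = {
    "subset_1": induced_subgraph(graph, [1, 2, 3]),
    "subset_2": induced_subgraph(graph, [3, 4]),
}
```

Looking up subsets["subset_2"] then gives a graph that any routine written for the main graph can work on unchanged.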

Related

Graph data structure selection

I need to implement an algorithm that works on planar graphs and I want to select an appropriate data structure to represent them.
The vertices are stored in an array, each with an associated pair of coordinates,
An edge has an associated polyline between its end vertices, with an arbitrary number of intermediate points (possibly none), stored in sequence in an auxiliary array.
The edges are undirected (if a=>b exists, then b=>a exists),
The following primitive operations must be supported:
adding an edge between two vertices designated by their indexes,
enumerating all edges originating from a given vertex (and recursively, all paths from a given vertex),
for a given edge, follow the associated polyline until the end vertex.
I am looking for a data structure that is space efficient O(V + E) and avoids data redundancy.
What would you use? Candidates that I see are adjacency lists, DCEL, and winged edges, but I may be missing one. I guess that quad edges are overkill.

Is there any other Data structure to represent Graph other than Adjacency List or Adjacency Matrix?

I was looking at different data structures for representing graphs when I came across the Nvidia CUDA Toolkit and found a new way to represent a graph with the help of source_indices and destination_offsets.
Fascinated by this representation, I searched for other ways of representing graphs, but did not find anything new.
I was wondering if there was any other way to represent Graph other than Adjacency Matrix or Lists...
There are alternatives to the adjacency list or the adjacency matrix, such as edge list, adjacency map or forward star to name a few. Given this graph (images taken from here):
this is the adjacency matrix representation:
this is the adjacency list representation:
this would be another alternative, the edge list:
and another pretty common one is the forward star representation:
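Since the images are not reproduced here, the following is a rough sketch of the four representations for a small hypothetical directed graph (not the graph from the original images). The forward star layout is built as an offsets array plus a targets array, which is an assumption about the usual form of that representation; the function names are illustrative only.

```python
# Hypothetical directed graph with nodes 0..3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Edge list: simply the (source, target) pairs.
edge_list = list(edges)

# Adjacency matrix: adj_matrix[u][v] == 1 when there is an edge u -> v.
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = 1

# Adjacency list: adj_list[u] holds the targets of the edges leaving u.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    adj_list[u].append(v)

def forward_star(n, edges):
    """Build (offsets, targets); targets[offsets[u]:offsets[u + 1]] are u's successors."""
    offsets = [0] * (n + 1)
    for u, _ in edges:
        offsets[u + 1] += 1              # count outgoing edges per node
    for i in range(n):
        offsets[i + 1] += offsets[i]     # prefix sums give each node's start position
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()         # next free slot for each node
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets

offsets, targets = forward_star(n, edges)

def neighbors(u):
    """Successors of u read from the forward star arrays."""
    return targets[offsets[u]:offsets[u + 1]]
```

The forward star arrays take O(V + E) space and keep each node's successors contiguous, which is why this layout is usually preferred for static graphs.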
If you get into this research field you will find a good number of approaches, mainly optimizations for specific cases, taking into account factors such as:
Graph size (number of nodes)
Density of the graph
Directed or undirected graph
Static or dynamic graph
Graph known at compile time or constructed at runtime
Node IDs (labeled sequentially or not)
...
These optimizations can, for example, support reordering of the nodes in a preprocessing stage to increase locality of reference. There is a lot of work on shortest path algorithms, especially for calculating shortest paths on a world map.
One example of optimization would be a dynamic graph structure (Packed-Memory Graph (PMG)) which is suited for large-scale transportation networks.
There is another representation of graphs, the adjacency set. It is very similar to the adjacency list, but each vertex's neighbors are stored in a set (for example, a balanced search tree) instead of a linked list. Note that this is an ordinary set per vertex, not the disjoint-set (union-find) structure.
If E is the number of edges and V is the number of vertices in the graph, then the adjacency set representation takes O(E + V) space.
Complexities of other operations when using an adjacency set backed by a balanced search tree:
Checking for an edge between vertices v and w: O(log(Degree(v)))
Iterating over the edges incident to vertex v: O(Degree(v))
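A minimal sketch of an adjacency set for an undirected graph might look like the following. The class name AdjacencySetGraph is hypothetical. Python's built-in set is hash-based, so the edge check here is O(1) on average rather than logarithmic; a tree-based set would give the log(Degree(v)) bound instead.

```python
from collections import defaultdict

class AdjacencySetGraph:
    """Undirected graph storing each vertex's neighbors in a set."""

    def __init__(self):
        self.adj = defaultdict(set)

    def add_edge(self, v, w):
        # Store the edge in both directions because the graph is undirected.
        self.adj[v].add(w)
        self.adj[w].add(v)

    def has_edge(self, v, w):
        # Test `v in self.adj` first so a query does not create an empty entry.
        return v in self.adj and w in self.adj[v]

    def neighbors(self, v):
        # Iterating costs O(Degree(v)).
        return iter(self.adj.get(v, ()))

g = AdjacencySetGraph()
g.add_edge("a", "b")
g.add_edge("a", "c")
```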

Adjacency list with O(1) look up time using HashSet?

In my algorithms class I've been told that a drawback of adjacency lists for graph representation is the O(n) lookup time for iterating through the array of adjacent nodes corresponding to each node. I implement my adjacency list by using a HashMap that maps nodes to a HashSet of their adjacent nodes. Wouldn't that take only O(1) lookup time? Is there something I'm missing?
As you know, looking up a value by key in a HashMap is O(1). However, in an adjacency list the value stored in the HashMap is itself a collection of adjacent nodes (a HashSet in your case), and the main purpose of an adjacency list is to iterate over those adjacent nodes, for example in graph traversal algorithms like DFS and BFS. If the HashSet holds n elements, iterating over all of them is still O(n).
So the total complexity would be O(1) + O(n), where O(1) is the lookup in the HashMap and O(n) is the iteration over all the elements.
Generally, an adjacency list is preferable for a sparse graph, that is, a graph with only a few edges. In such a graph each node (each key of the HashMap) has few adjacent nodes, so iterating over them won't cost much.
I implement my adjacency list by using a HashMap that maps nodes to a HashSet of their adjacent nodes, wouldn't that only take O(1) look up time? [emphasis mine]
Right — but "adjacency list" normally implies a representation as an array or a linked-list rather than a HashSet: in other words, adjacency lists are optimized for iterating over a vertex's neighbors rather than for querying if two vertices are neighbors.
It may be possible to produce more time-efficient graph representations than adjacency lists, particularly for graphs where vertices often have many edges.
With a map of vertices where each vertex contains a map of neighbor vertices and/or edge objects, we can check whether two nodes are connected in O(1) average time by indexing a vertex id and then indexing a neighbor. That's potentially a big saving over an adjacency list, where we might have to loop over many edges to find a specific neighbor. Furthermore, a map-of-maps data structure allows us to store arbitrary data in edge objects. That's useful for weighted graphs and for features of actions/edges.
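The map-of-maps idea can be sketched as follows, using plain Python dictionaries in place of HashMap. The function names add_edge and connected, and the edge attributes, are hypothetical and only illustrate the shape of the structure.

```python
def add_edge(graph, u, v, **data):
    """Add an undirected edge u-v; both directions share the same edge data dict."""
    graph.setdefault(u, {})[v] = data
    graph.setdefault(v, {})[u] = data

def connected(graph, u, v):
    """Two dictionary lookups: average O(1), independent of the degree of u."""
    return v in graph.get(u, {})

graph = {}
add_edge(graph, "A", "B", weight=3.0)
add_edge(graph, "A", "C", weight=1.5, label="road")
```

Iterating over graph["A"] still visits all neighbors of A, so the structure keeps the traversal use case of an adjacency list while adding fast edge queries.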

Graph implementation knowing edges ahead of time

I'm looking for an efficient way to implement a weighted undirected graph knowing only the number of edges ahead of time.
sample input:
N (number of edges)
A B x (x is the distance from A to B)
.
.
I've thought of using adjacency lists of Node* (I need to know the neighbours) and storing the nodes in a dynamic hash table (I don't know how many nodes there will be, so I need a dynamic search/insert container).
Are there better ways to do it?
Sorry for my bad english! :D
Given the format you're getting the input in, a very reasonable approach would be to use a hash table of lists, where the keys are the nodes and the values are lists of (node, distance) pairs. Alternatively, if you have a dense graph and want to be able to quickly determine the distance from one node to another, it might be good to have a hash table of hash tables, where the top-level hash table maps each node to a second hash table, which in turn maps each neighbor of that node to the cost of the connecting edge. This still lets you iterate across a node's outgoing edges, but gives you faster lookup of distances.
Another idea (depending on the use case) would be to start off by building the first data structure (the hash table of lists), then to post process it by building an adjacency matrix. This would be useful if you didn't need to iterate across a node's outgoing edges and needed fast random access to distances between nodes. It is similar to the hash table of hash tables, but is probably more space efficient.
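The three options above can be sketched roughly as follows, assuming the input arrives as lines of text in the format from the question (an edge count followed by "A B x" lines). The function names read_graph and to_matrix are hypothetical, and using infinity for "no edge" in the matrix is an assumption made for this illustration.

```python
from collections import defaultdict

def read_graph(lines):
    """Parse 'N' followed by N lines 'A B x' into both hash-table variants."""
    it = iter(lines)
    n_edges = int(next(it))
    adj_lists = defaultdict(list)   # node -> list of (neighbor, distance)
    adj_maps = defaultdict(dict)    # node -> {neighbor: distance}
    for _ in range(n_edges):
        a, b, x = next(it).split()
        x = float(x)
        # The graph is undirected, so record the edge in both directions.
        adj_lists[a].append((b, x))
        adj_lists[b].append((a, x))
        adj_maps[a][b] = x
        adj_maps[b][a] = x
    return adj_lists, adj_maps

def to_matrix(adj_maps):
    """Post-process into an adjacency matrix; missing edges are float('inf')."""
    nodes = sorted(adj_maps)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[float("inf")] * len(nodes) for _ in nodes]
    for a, nbrs in adj_maps.items():
        for b, x in nbrs.items():
            matrix[index[a]][index[b]] = x
    return index, matrix

adj_lists, adj_maps = read_graph(["2", "A B 3", "B C 4.5"])
index, matrix = to_matrix(adj_maps)
```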
Hope this helps!

Adjacency List structure in HBase

I'm trying to implement the following graph reduction algorithm in
The graph is an undirected weighted graph
I want to strip away all nodes with only two neighbors
and update the weights
Have a look at the following illustration:
Algorithm reduce graph http://public.kungi.org/graph-reduction.png
The algorithm shall transform the upper graph into the lower one. Eliminate node 2 and update the weight of the edge to: w(1-3) = w(1-2)+w(2-3)
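Ignoring MapReduce and HBase for a moment, the reduction step itself can be sketched in memory as below. This assumes the graph is a symmetric dictionary of dictionaries (node -> neighbor -> weight) with no self-loops. The function name reduce_graph is hypothetical, and leaving a node in place when its two neighbors are already adjacent is an assumption made here to avoid creating parallel edges; the question does not say how that case should be handled.

```python
def reduce_graph(adj):
    """Remove nodes with exactly two neighbors, merging their two edges. Modifies adj."""
    candidates = [v for v in adj if len(adj[v]) == 2]
    while candidates:
        v = candidates.pop()
        if v not in adj or len(adj[v]) != 2:
            continue
        (a, w_a), (b, w_b) = adj[v].items()
        if b in adj[a]:
            # Assumption: keep v when a and b are already connected.
            continue
        # Drop v and replace the path a-v-b with a single edge a-b.
        del adj[a][v], adj[b][v], adj[v]
        adj[a][b] = adj[b][a] = w_a + w_b
    return adj

# Example from the question: w(1-3) = w(1-2) + w(2-3).
adj = {1: {2: 2}, 2: {1: 2, 3: 3}, 3: {2: 3}}
reduce_graph(adj)
```

Removing a node this way leaves the degrees of its two neighbors unchanged, so a single pass over the initial candidates is enough in this sketch.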
Since I have a very large graph I'm doing this with MapReduce.
My question is how to represent the graph in HBase. I thought about building an adjacency list structure in HBase like this:
Column families: nodes, neighbors
1 -> 2, 6, 7
...
Is there a nicer way to do this?
Adjacency lists are the most frequently recommended structure.
You could use each node ID as the row ID and neighbor IDs as column qualifiers, with the weights as values.
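To make that layout concrete, here is a small sketch that models the rows as plain Python dictionaries rather than calling any HBase client. The function name to_rows and the column family name "neighbors" are assumptions for illustration; each column is written in the family:qualifier form, with the neighbor ID as qualifier and the weight as value.

```python
def to_rows(adj, family="neighbors"):
    """Map node -> {neighbor: weight} to row key -> {'family:qualifier': value}."""
    rows = {}
    for node, nbrs in adj.items():
        # One row per node; one column per neighbor, holding the edge weight.
        rows[str(node)] = {f"{family}:{nbr}": str(w) for nbr, w in nbrs.items()}
    return rows

# Hypothetical weights for node 1 and its neighbors 2, 6 and 7.
adj = {1: {2: 4, 6: 1, 7: 3}}
rows = to_rows(adj)
```

With this shape, reading one row returns a node's whole neighborhood, which is what the reduction step needs in order to decide whether a node has exactly two neighbors.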
