Algorithm for subgraph combinations search - algorithm

With this undirected graph
The graph contains nodes of the types [A,B,C,D,E].
Several different nodes can have the same type.
Now imagine you have a set of node types [A,B,E]. You don't know which nodes in the graph they correspond to; the only thing you know is the type of each node.
What you have to do is find a best fit for that given set of nodes.
The nodes have to be connected to each other.
I've been testing an algorithm which consists of the steps below:
Convert the graph to a linked list
Generate all possible combinations of nodes, considering the given types and how many times each node type appears. The given example is [A,B,E], but it could be another set such as [A,B,C,A].
Some of possible (not all) combinations for [A,B,E] are:
Check if nodes in those combinations are connected to each other
The best fit is the first combination where all nodes are connected.
The problem is the number of nodes in the given graph. For small sets of nodes and small graphs the algorithm is OK. But when the number of nodes increases, I get thousands of possible combinations, and those combinations consume a lot of memory.
I've been searching for an algorithm that could solve this problem efficiently with a low memory cost.
I have spent days reading and testing all kinds of algorithms, and until now I couldn't find a better solution.
Suggestions are much appreciated.

This is called the Graph Motif problem and unfortunately it's NP-hard, even when the graph is a tree with maximum degree 3: see Theorem 1 in https://people.mpi-inf.mpg.de/~hermelin/Conference%20Publications/Connected%20Motifs.pdf
This means it's very unlikely that any polynomial-time algorithm exists that can solve this problem.
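NP-hardness does not rule out an exact search that is lighter on memory than generating every combination up front. The following is a minimal sketch, not a known-optimal method: a depth-first backtracking search that grows one connected node set at a time and only ever adds a neighbouring node whose type is still needed. The names `find_motif`, `adj`, `node_type` and `wanted` are hypothetical, and the running time remains exponential in the worst case.

```python
from collections import Counter

def find_motif(adj, node_type, wanted):
    """Return a connected set of nodes whose types match `wanted`, or None.

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    node_type: dict mapping node -> type label.
    wanted: list of type labels, repeats allowed (e.g. ["A", "B", "C", "A"]).
    """
    need = Counter(wanted)   # how many nodes of each type are still missing
    target = len(wanted)

    def extend(chosen):
        if len(chosen) == target:
            return set(chosen)
        # Only nodes adjacent to the current selection keep it connected.
        frontier = {v for u in chosen for v in adj[u] if v not in chosen}
        for v in frontier:
            t = node_type[v]
            if need[t] > 0:
                need[t] -= 1
                chosen.add(v)
                found = extend(chosen)
                if found is not None:
                    return found
                chosen.remove(v)   # undo and try the next candidate
                need[t] += 1
        return None

    for start in adj:
        t = node_type[start]
        if need[t] > 0:
            need[t] -= 1
            found = extend({start})
            need[t] += 1
            if found is not None:
                return found
    return None
```

Only the current partial selection and its frontier are held in memory at any time, so memory use stays small even when the number of candidate combinations is large. The first match found is returned, mirroring the "first connected combination" rule from the question.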

Related

How to partition a graph, so as the weight of the nodes in each partition would be between two numbers?

There is a specific problem in operations research that is converted into a graph:
I would like to partition the graph into n sub-graphs, so as the nodes in sub-graphs are connected, and the sum of the weights of the nodes is between two numbers, a and b.
Since I'm coming from an operations research background, I first thought about proposing a MIP model in order to solve the problem, but soon there were a lot of problems on how to model the constraints. I understand the usual edge-cut algorithms and have used them to tackle the problem.
I'll describe my thought process until now, hoping you have any ideas on this problem:
I generate a spanning tree, then I pick some n edges to cut so that the number of partitions is right. From there I need to add or delete nodes from each partition to reach a feasible solution. At this stage, I feel like there should be a logical process. I tried to describe a procedure on paper, but couldn't get it right.
I wish also to know if the problem of partitioning the graph into n subgraphs with weights between a and b, could be feasible, given a special graph.
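Whatever repair procedure is used after cutting the spanning tree, it helps to have a check for the two constraints stated above. The following is a minimal sketch of such a feasibility check, assuming an adjacency-set representation; the names `is_feasible_partition`, `adj`, `weight` and `parts` are hypothetical. It only verifies a candidate partition and does not construct one.

```python
from collections import deque

def is_feasible_partition(adj, weight, parts, a, b):
    """Check that `parts` splits the graph into connected blocks of weight in [a, b].

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    weight: dict mapping node -> numeric weight.
    parts: iterable of node collections, one per sub-graph.
    """
    seen = set()
    for part in parts:
        part = set(part)
        if not part or part & seen:
            return False              # empty block, or a node used twice
        seen |= part
        if not a <= sum(weight[v] for v in part) <= b:
            return False              # weight bound violated
        # Breadth-first search restricted to the block checks connectivity.
        start = next(iter(part))
        reached = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in part and v not in reached:
                    reached.add(v)
                    queue.append(v)
        if reached != part:
            return False              # block is not connected
    return seen == set(adj)           # every node must be assigned
```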

Minimum number of cameras required to cover all nodes in graph

I came across a problem in leetcode named "Binary Tree Camera".
I was wondering how to approach this similar problem:
You have to place cameras at nodes of a graph such that the whole graph is covered. A camera on a node monitors all its immediate neighbour nodes and itself. Find the minimum number of cameras required to cover all nodes.
This is a set cover problem, which there are many well-known algorithms for. To model it as an instance of the set cover problem, map each node to the set of nodes which a camera at that node would cover. The original problem of choosing the fewest number of nodes is equivalent to choosing the fewest number of those sets.
In general, this is an NP-hard problem, meaning there is no known algorithm which always gives the minimum covering and also scales well to large instances of the problem. Since the problem asks for the minimum, a heuristic algorithm is not suitable, so you will need to do something like a backtracking search.
This problem is called the Minimum Dominating Set and is NP-hard for the general graph case. Algorithms exist that approach this problem by either approximation, parameterization or restricting the class of graphs. See the Wikipedia link for details.
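The set cover modelling described above can be sketched as follows. This is a minimal illustration that uses plain exhaustive search over camera sets of increasing size rather than a tuned backtracking search, so it is only practical for small graphs. The function name `min_dominating_set` and the adjacency-dictionary input are assumptions made for the example.

```python
from itertools import combinations

def min_dominating_set(adj):
    """Return a smallest set of nodes whose cameras cover every node.

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    """
    nodes = list(adj)
    # Set cover instance: a camera at v covers v itself and its neighbours.
    covers = {v: {v} | set(adj[v]) for v in nodes}
    everything = set(nodes)
    # Try every camera placement of size 0, 1, 2, ... so that the first
    # placement that covers all nodes is guaranteed to be a minimum one.
    for k in range(len(nodes) + 1):
        for cams in combinations(nodes, k):
            covered = set().union(*(covers[v] for v in cams))
            if covered == everything:
                return set(cams)
```

Because placing a camera on every node always covers the graph, the loop is guaranteed to return by the time `k` reaches the number of nodes.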

Graph algorithms?

What would be the most efficient way to identify whether a graph can be grouped into two sub-graphs such that, in each sub-graph, every node is connected to every other node? For example, the graph in the image can be grouped into two sub-graphs of 4 nodes and 5 nodes respectively, where in each sub-graph every node is connected to every other node. Given a graph, what would be the most efficient way or algorithm to check whether this property holds?
This is known as the clique problem. A clique is a subset of vertices such that every two vertices in the subset are connected in the original graph (that is, they form a complete subgraph). You are asking for an algorithm that lists all maximal cliques (cliques that are not part of larger cliques). There are various algorithms for this. The usual one used is the Bron-Kerbosch algorithm. It has a worst-case running time of O(3^(n/3)), where n is the number of vertices in the graph.
Other algorithms may be applicable if your graph has special structure (e.g., planar, which your example is not).
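For reference, here is a minimal sketch of the basic Bron-Kerbosch recursion, without the pivoting refinement that practical implementations usually add. The function name `bron_kerbosch` and the adjacency-dictionary input are assumptions made for the example.

```python
def bron_kerbosch(adj):
    """List all maximal cliques of an undirected graph.

    adj: dict mapping node -> set of neighbouring nodes (no self-loops).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: nodes already handled (used to avoid reporting duplicates).
        if not p and not x:
            cliques.append(set(r))    # nothing can extend r, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques
```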
EDIT: Actually, if your decision problem is whether the graph can be partitioned into exactly two cliques, there's a more efficient algorithm, as described in this thread (which then becomes a duplicate of your question).
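The linked thread is not reproduced here, so the following is an assumption about the intended approach rather than a summary of it. A standard fact is that a graph can be split into two cliques exactly when its complement graph is bipartite, which can be tested by 2-colouring the complement with a breadth-first search. The names `two_clique_partition`, `nodes` and `edges` are hypothetical.

```python
from collections import deque

def two_clique_partition(nodes, edges):
    """Return (clique_a, clique_b) covering all nodes, or None if impossible.

    nodes: iterable of node labels; edges: iterable of (u, v) pairs (undirected).
    One of the two returned sets may be empty, e.g. for a complete graph
    on a single node.
    """
    nodes = list(nodes)
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    side = {}                         # node -> 0 or 1 (colour in the complement)
    for start in nodes:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in nodes:
                if v == u or v in adj[u]:
                    continue          # (u, v) is not an edge of the complement
                # u and v are NOT adjacent, so they cannot share a clique.
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:
                    return None       # odd cycle in the complement
    return ({v for v in nodes if side[v] == 0},
            {v for v in nodes if side[v] == 1})
```

This sketch scans all nodes for each dequeued node, so it runs in time quadratic in the number of nodes.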

Is it possible to develop an algorithm to solve a graph isomorphism?

Or will I need to develop an algorithm for every unique graph? The user is given a type of graph, and they are then supposed to use the interface to add nodes and edges to an initial graph. Then they submit the graph and the algorithm is supposed to confirm whether the user's graph matches the given graph.
The algorithm needs to confirm not only the neighbours of each node, but also that each node and each edge has the correct value. The initial graphs will always have a root node, which is where the algorithm can start from.
I am wondering if I can develop the logic for such an algorithm in the general sense, or will I need to actually code a unique algorithm for each unique graph. It isn't a big deal if it's the latter case, since I only have about 20 unique graphs.
Thanks. I hope I was clear.
The graph isomorphism problem might not be hard, but it is very hard to prove that it is not hard.
There are three possibilities for this problem.
1. The graph isomorphism problem is NP-hard.
2. The graph isomorphism problem has a polynomial-time solution.
3. The graph isomorphism problem is neither NP-hard nor in P.
If two graphs are isomorphic, then there exists a permutation of the vertices that realizes this isomorphism. Taking this permutation as a certificate, we can verify in polynomial time that the two graphs are isomorphic to each other. Thus, graph isomorphism is in NP. However, for more than 30 years no one has been able to prove whether this problem is NP-hard or in P. Thus, this problem is intrinsically hard despite its simple description.
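The certificate check mentioned above is easy to write down. Here is a minimal sketch for undirected graphs without node or edge values; the function name `is_isomorphism` and its argument layout are assumptions made for the example.

```python
def is_isomorphism(nodes_g, edges_g, nodes_h, edges_h, mapping):
    """Verify that `mapping` (node of G -> node of H) is an isomorphism.

    nodes_*: iterables of node labels; edges_*: iterables of (u, v) pairs.
    Runs in time polynomial in the size of the two graphs.
    """
    nodes_g, nodes_h = set(nodes_g), set(nodes_h)
    # The mapping must be a bijection between the two node sets.
    if set(mapping) != nodes_g or set(mapping.values()) != nodes_h:
        return False
    if len(nodes_g) != len(nodes_h):
        return False
    # Map every edge of G through the permutation and compare with H.
    mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges_g}
    return mapped == {frozenset(e) for e in edges_h}
```

Verifying a given mapping is the cheap part; the hard part of the problem is finding such a mapping among all possible permutations.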
If I understand the question properly, you can have ONE single algorithm, which will work by accepting one of several reference graphs as its input (in addition to the unknown graph whose isomorphism with the reference graph is to be asserted).
It appears that you seek to assert whether a given graph is exactly identical to another graph, rather than whether the graphs are isomorphic relative to a particular set of operations or characteristics. This implies that the algorithm is supplied a specific reference graph, rather than working off some set of "abstract" rules, such as whether neither graph has loops or both graphs are fully connected, even though the graphs may differ in some other fashion.
Edit, following confirmation that:
Yeah, the algorithm would be supplied a reference graph (which is the answer), and will then check the user's graph to see if it is isomorphic (including the values of edges and nodes) to the reference
In that case, yes, it is quite possible to develop a relatively simple algorithm which would assert isomorphism of these two graphs. Note that the considerations mentioned in other remarks and answers, relative to the fact that the problem may be NP-hard, merely indicate that a simple algorithm [or any algorithm for that matter] may not be sufficient to solve the problem in a reasonable amount of time for graphs whose size and complexity are too big. However, assuming relatively small graphs and taking advantage (!) of the requirement that the weights of edges and nodes also need to match, the following algorithm should generally be applicable.
General idea:
For each sub-graph that is disconnected from the rest of the graph, identify one (or possibly several) node(s) in the user graph which must match a particular node of the reference graph. By following the paths from this node [in an orderly fashion, more on this below], assert the identity of other nodes and/or determine that there are some nodes which cannot be matched (and hence that the two structures are not isomorphic).
Rough pseudo code:
1. For both the reference and the user-supplied graph, make the list of their connected components, i.e. the list of sub-graphs therein which are disconnected from the rest of the graph. Finding these connected components is done by following either a breadth-first or a depth-first path starting at a given node and "marking" all nodes on that path with an arbitrary [typically incremental] element ID number. Once a given path has been fully visited, repeat the operation from any other non-marked node, and do so until there are no more non-marked nodes.
2. Build a "database" of the characteristics of each graph.
This will be useful to identify matching candidates and also to determine, early on, instances of non-isomorphism.
Each "database" would have two kinds of "records", node and edge, with the following fields, respectively:
- node: node_id, Connected_element_Id, node weight, number of outgoing edges, number of incoming edges, sum of outgoing edges weights, sum of incoming edges weights.
- edge: edge_id, Connected_element_Id, edge weight, node_id_of_start, node_id_of_end, weight_of_start_node, weight_of_end_node
3. Build a database of the Connected elements of each graph
Each record should have the following fields: Connected_element_id, number of nodes, number of edges, sum of node weights, sum of edge weights.
4. [optionally] Dispatch the easy cases of non-isomorphism:
4.a mismatch of the number of connected elements
4.b mismatch of the number of connected elements, grouped by all fields but the id (number of nodes, number of edges, sum of nodes weights, sum of edges weights)
5. For each connected element in the reference graph
5.1 Identify candidates for the matching connected element in the user-supplied graph. The candidates must have the same connected element characteristics (number of nodes, number of edges, sum of nodes weights, sum of edges weights) and contain the same list of nodes and edges, again, counted by grouping by all characteristics but the id.
5.2 For each candidate, finalize its confirmation as an isomorphic graph relative to the corresponding connected element in the reference graph. This is done by starting at a candidate node-match, i.e. a node, hopefully unique, which has the exact same characteristics on both graphs. In case there is no such node, one needs to disqualify each possible candidate until isomorphism can be confirmed (or all candidates are exhausted). From the candidate node match, walk the graph, say breadth-first, finding matches for the other nodes on the basis of the direction and weight of the edges and the weight of the nodes.
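Steps 1, 3 and 4 of the pseudo code can be sketched as follows. This is only an illustration under assumed data structures: each graph is given as a dictionary `node_weight` (node -> weight) and a list `edges` of `(start, end, weight)` tuples, and all function names are hypothetical. It covers the cheap rejection tests only, not the final matching of step 5.

```python
from collections import Counter, deque

def connected_components(nodes, edges):
    """Step 1: mark every node with an incremental connected-element id."""
    adj = {v: set() for v in nodes}
    for u, v, _w in edges:            # edge direction is ignored here
        adj[u].add(v)
        adj[v].add(u)
    comp, next_id = {}, 0
    for start in nodes:
        if start in comp:
            continue
        comp[start] = next_id
        queue = deque([start])
        while queue:                  # breadth-first marking
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp[v] = next_id
                    queue.append(v)
        next_id += 1
    return comp

def component_summaries(node_weight, edges):
    """Step 3: one record per connected element, with the id dropped.

    Each record is (number of nodes, number of edges,
    sum of node weights, sum of edge weights).
    """
    comp = connected_components(node_weight, edges)
    stats = {}
    for v, w in node_weight.items():
        record = stats.setdefault(comp[v], [0, 0, 0, 0])
        record[0] += 1
        record[2] += w
    for u, _v, w in edges:
        record = stats[comp[u]]
        record[1] += 1
        record[3] += w
    return Counter(tuple(r) for r in stats.values())

def obviously_different(reference, user):
    """Step 4: True if the summaries already prove non-isomorphism."""
    return component_summaries(*reference) != component_summaries(*user)
```

A result of `False` from `obviously_different` does not prove isomorphism; it only means the two graphs survive the quick checks and the detailed matching still has to run.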
The main tricks with this algorithm are to keep proper accounting of the candidates (whether a candidate connected element at the higher level or a candidate node at the lower level), and to also remember and mark other identified items as such (and un-mark them if the hypothetical candidate eventually proves not to be feasible).
I realize the above falls short of a formal algorithm description, but it should give you an idea of what is required and possibly a starting point, should you decide to implement it.
You may remark that the requirement of matching node and edge weights, which may appear to be an added difficulty for asserting isomorphism, effectively simplifies the algorithm, because the underlying node/edge characteristics render these more unique and hence make it more likely that the algorithm will a) find unique node candidates and b) either quickly find other candidates on the path and/or quickly assert non-isomorphism.
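To make the mark/un-mark bookkeeping concrete, here is a minimal backtracking sketch for small directed graphs with node and edge weights. It is a simplification of the procedure described above: it assigns reference nodes in a fixed order instead of walking breadth-first, and it does not split the graphs into connected elements first. The name `find_isomorphism` and the input layout (node -> weight dictionaries, `(start, end)` -> weight edge dictionaries, weights never `None`) are assumptions.

```python
def find_isomorphism(ref_nodes, ref_edges, usr_nodes, usr_edges):
    """Return a mapping reference node -> user node, or None if none exists."""
    if len(ref_nodes) != len(usr_nodes) or len(ref_edges) != len(usr_edges):
        return None

    def signature(nodes, edges, v):
        # Node weight plus sorted weights of outgoing and incoming edges.
        out_w = sorted(w for (a, _b), w in edges.items() if a == v)
        in_w = sorted(w for (_a, b), w in edges.items() if b == v)
        return (nodes[v], tuple(out_w), tuple(in_w))

    ref_sig = {v: signature(ref_nodes, ref_edges, v) for v in ref_nodes}
    usr_sig = {v: signature(usr_nodes, usr_edges, v) for v in usr_nodes}
    order = list(ref_nodes)
    mapping, used = {}, set()

    def consistent(r, u):
        # Edges to already matched nodes must agree in direction and weight.
        for r2, u2 in mapping.items():
            if ref_edges.get((r, r2)) != usr_edges.get((u, u2)):
                return False
            if ref_edges.get((r2, r)) != usr_edges.get((u2, u)):
                return False
        return ref_edges.get((r, r)) == usr_edges.get((u, u))  # self-loops

    def assign(i):
        if i == len(order):
            return True
        r = order[i]
        for u in usr_nodes:
            if u in used or usr_sig[u] != ref_sig[r] or not consistent(r, u):
                continue
            mapping[r] = u            # mark the hypothetical match
            used.add(u)
            if assign(i + 1):
                return True
            del mapping[r]            # un-mark: the candidate did not work out
            used.discard(u)
        return False

    return dict(mapping) if assign(0) else None
```

When the weights make most signatures unique, each reference node has very few candidates and the search finishes quickly, which is the effect described in the last paragraph above.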

Finding equal subgraphs

Given:
a directed Graph
Nodes have labels
the same label can appear more than once
edges don't have labels
I want to find the set of largest (connected) subgraphs which are equal taking the labels of the nodes into account.
The graph could be huge (millions of nodes). Does anyone know an efficient solution for this?
I'm looking for an algorithm and ideally a Java implementation.
Update: Since this problem is most likely NP-complete, I would also be interested in an algorithm that produces an approximate solution.
This seems to be close at least:
Frequent Subgraphs
I strongly suspect that's NP-hard.
Even if all the labels are the same that's at least as hard as graph isomorphism. (Join the two graphs together as a single disconnected graph; are the largest equal subgraphs the two original graphs?)
If identical labels are relatively rare it might be tractable.

Resources