Non-bipartite matching algorithm

I'm trying to find a specific algorithm that lets me match people based on many criteria. All of the people are in the same set, and edges cannot share vertices. Basically like a dating website, but as I said: only one set, so not bipartite.
Despite my research I wasn't able to find this algorithm; almost everything is about bipartite graphs, or allows common vertices. I'm specifically looking for a perfect matching (it can be slow).
It seems the algorithm is supposed to be based on the Ford-Fulkerson algorithm (which is usually used for bipartite matching), but I still don't see how to apply it here. Do you have any clues? Thanks.

You can find a maximum matching in a non-bipartite graph using the blossom algorithm (it's quite complicated, so I won't describe it here).
Once you have a maximum matching, checking whether it's perfect is straightforward: just compare its size with the number of vertices in the graph.
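For a quick experiment, here is a minimal sketch assuming networkx is available (its max_weight_matching is a blossom-based solver). With unit weights and maxcardinality=True it returns a maximum-cardinality matching, and the perfection check is the size comparison described above:

    import networkx as nx

    G = nx.Graph()
    # Hypothetical compatibility edges within a single set of people.
    G.add_edges_from([("ann", "bob"), ("bob", "cat"),
                      ("cat", "dan"), ("dan", "ann")])

    # With no weights every edge counts as 1, so maxcardinality=True
    # yields a plain maximum-cardinality matching.
    matching = nx.max_weight_matching(G, maxcardinality=True)

    # The matching is perfect iff it covers every vertex.
    print(matching, 2 * len(matching) == G.number_of_nodes())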

Related

Why do we need to handle blossoms when finding maximum matchings?

We know the standard algorithm for finding a maximum matching in a general graph.
https://en.wikipedia.org/wiki/Blossom_algorithm
What I am trying to understand is why blossoms need to be handled separately.
I think finding an augmenting path and flipping it is enough. It seems to work with odd cycles as well.
You are correct that you can form a maximum matching by using the following general algorithm:
Search for an augmenting path with respect to the current matching.
If such a path exists, augment along that path to increase the size of the matching by one, and repeat.
If no such path exists, your matching is maximum and you're done.
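For concreteness, here is a tiny Python sketch (all names here are mine, purely illustrative) of the augment step: flipping the matched/unmatched edges along an augmenting path grows the matching by exactly one edge.

    def augment(matching, path):
        # matching: set of frozenset edges; path: vertex list whose
        # endpoints are unmatched and whose edges alternate
        # unmatched / matched / ... / unmatched.
        path_edges = {frozenset(e) for e in zip(path, path[1:])}
        # Symmetric difference: matched path edges leave the matching,
        # unmatched ones enter, for a net gain of one edge.
        return matching ^ path_edges

    m = {frozenset({"b", "c"})}        # current matching: one edge
    p = ["a", "b", "c", "d"]           # augmenting path a-b-c-d
    print(augment(m, p))               # two edges: {a,b} and {c,d}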
The challenge, though, is determining whether such an augmenting path exists in the graph. With a small graph it might not be all that hard to find a path, but as the graphs get larger and larger and the partial matchings increase in size it can become pretty challenging. Brute-force searching through all the possible paths is not feasible at this scale.
In the case where the graph is bipartite (there are no odd-length cycles), there are nice algorithms that do this that are based on a modification of breadth-first or depth-first search. They work nicely because you can classify nodes as being either at an odd or even distance from a start node. With odd cycles, this breaks down, and these simple algorithms don't work.
The reason the blossom algorithm exists - and the reason why blossoms are there in the first place - is that they provide a mechanism for efficiently searching the graph for augmenting paths even in the presence of odd cycles. The intuition is that every time you see a blossom, you can contract it down to a single point in a way that doesn't affect your ability to find augmenting paths. Contracting an odd cycle reduces the size of the graph, and through recursion you'll either end up with a graph where it's easy to find an augmenting path or you'll find that there aren't any.
So in a sense, the reason we use blossoms is that it enables us to efficiently (in polynomial time) check whether augmenting paths exist and, if so, find them and augment across them.

what kind of algorithm is this (stable marriage variation)?

I have a set of objects (between 1 and roughly 500). Each object is compatible with certain (zero or more) other objects from that same set.
Can anyone give me some pointers as to how to determine the best way to create pairs of objects that are compatible with each other so that most of the objects in the set are paired?
You're looking for a maximum matching in a general graph. As opposed to the stable marriage problem with which you are familiar, in the maximum matching problem the input graph is not necessarily bipartite. There is no notion of stability (as vertices do not rank their compatible options) and what you're looking for is a subset of the edges of the graph such that no two edges share a common vertex (a.k.a., a matching). You're trying to construct that matching which contains the maximum possible number of edges.
Luckily, the problem of finding a maximum matching in a general graph can be solved in polynomial time using Edmonds' matching algorithm (also known as the blossom algorithm because of how it contracts blossoms (odd cycles) into single vertices). The time complexity of Edmonds' matching algorithm is O(E•V^2). While not very efficient, I believe this is good enough for the relatively small graphs you're dealing with. You don't even have to implement it from scratch yourself, as there's an open-source Java implementation of Edmonds' algorithm you can use. However, if you're interested in the state of the art, you can use the most efficient algorithm known for the problem, which runs in O(E•sqrt(V)).
If the vertex compatibility of your input is not dichotomous (that is, each vertex has a ranking specifying its preferences among its neighbors), you can add corresponding weights to the edges to encode the preference profile and use the variation of Edmonds' algorithm for weighted graphs.
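As a hedged sketch of that weighted variation (assuming networkx; the preference encoding below is my own illustration, not prescribed by the answer): weight each edge by how highly its endpoints rank each other, then ask for a maximum-weight matching among the maximum-cardinality ones.

    import networkx as nx

    G = nx.Graph()
    G.add_edge("a", "b", weight=3)   # strong mutual preference
    G.add_edge("b", "c", weight=1)
    G.add_edge("c", "d", weight=3)
    G.add_edge("a", "d", weight=1)

    # maxcardinality=True pairs as many objects as possible first;
    # among those matchings the solver maximizes total weight, so
    # {a,b} and {c,d} win over {b,c} and {a,d} here.
    print(nx.max_weight_matching(G, maxcardinality=True))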

How to divide a connected weighted graph to N semi-equal subgraphs

I have a graph of many hundreds of nodes that are mostly connected with each other. I can process the entire graph, but that takes a lot of time, so I would like to divide it into smaller sub-graphs of approximately similar size.
In other words: I have a collection of aerial images and I do pairwise image matching on all of them. As a result I get a set of matches for each pair (a pixel from the first image matched with a pixel on the second image). The number of matches is used as the weight of this (undirected) edge. These edges then form the graph mentioned above.
I'm not so familiar with graph theory (as it is a very broad topic). What is the best algorithm for this job?
Thank you.
Edit:
This problem has a perfect analogy which I think is easier to understand. Imagine you have a set of people and their connections/friendships, like a social network. Each friendship has a numeric value/weight representing how good friends they are. So in a large group of people I want to get the k most interconnected sub-groups.
Unfortunately, the problem you're describing is almost certainly NP-hard. From a graph perspective, you have a graph where each edge has a weight on it. You're trying to split the graph into pieces while minimizing the total weight of the edges cut. This problem is called the minimum k-cut problem and is NP-hard. If you introduce the constraint that you also want the pieces to be roughly even in size, you have the balanced k-cut problem, which is also NP-hard.
The good news is that there are nice approximation algorithms for these problems, so if you're looking for solutions that are just "good enough," then you can probably find a library somewhere that implements them. There are also other techniques like spectral clustering which work well in practice and are really fast, but which don't have any guarantees on how well they'll do.
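If you want to try the spectral route, here is a rough sketch assuming scikit-learn is available; the affinity matrix is just your weighted adjacency matrix (the pairwise match counts), and note that nothing here guarantees balanced cluster sizes:

    import numpy as np
    from sklearn.cluster import SpectralClustering

    # Toy weighted adjacency: nodes 0-2 are heavily interconnected,
    # node 3 is only weakly attached.
    affinity = np.array([[ 0, 50, 40,  1],
                         [50,  0, 30,  2],
                         [40, 30,  0,  1],
                         [ 1,  2,  1,  0]])

    labels = SpectralClustering(n_clusters=2,
                                affinity="precomputed").fit_predict(affinity)
    print(labels)   # one cluster index per node, e.g. [0 0 0 1]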

Weighted bipartite matching

I'm aware there are a lot of similar topics. But most of them left me with some doubts in my case. What I want to do is find a perfect matching (or as close to perfect as possible if there's no perfect matching, of course), and then, from all those matchings where you are able to match k out of n vertices (where k is the highest possible), choose the one with the highest possible total weight.
So simply put, I want the following priority:
Match as many vertices as possible.
Because a (non-weighted) maximum matching is in most cases ambiguous, I want to choose the one that has the biggest sum of weights on its edges. If there are several of them with the same weight, it doesn't matter which is chosen.
I've heard about the Ford-Fulkerson algorithm. Does it work the way I describe, or do I need a different algorithm?
If you're implementing this yourself, you probably want to use the Hungarian algorithm. Faster algorithms exist but aren't as easy to understand or implement.
Ford-Fulkerson is a maximum flow algorithm; you can use it easily to solve unweighted matching. Turning it into a weighted matching algorithm requires an additional trick; with that trick, you wind up with the Hungarian algorithm.
You can also use a min-cost flow algorithm to do weighted bipartite matching, but it might not work quite as well. There's also the network simplex method, but it seems to be mostly of historical interest.
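One way to get the questioner's two-level objective out of the Hungarian method is SciPy's linear_sum_assignment (a sketch; the big-constant trick below is my suggestion, not part of the algorithm itself): add a constant M larger than the sum of all weights to every real edge, so any matching with more edges always outweighs any matching with fewer, and ties are then broken by actual weight.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    w = np.array([[4., 1., 0.],    # 0. marks a missing edge
                  [2., 0., 3.],
                  [0., 5., 0.]])
    M = w.sum() + 1.0
    profit = np.where(w > 0, w + M, 0.0)   # real edges get the bonus M

    rows, cols = linear_sum_assignment(profit, maximize=True)
    pairs = [(r, c) for r, c in zip(rows, cols) if w[r, c] > 0]
    print(pairs, sum(w[r, c] for r, c in pairs))  # [(0,0),(1,2),(2,1)] 12.0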

Graph Isomorphism

Is there an algorithm or heuristics for graph isomorphism?
Corollary: A graph can be represented in different drawings.
What is the best approach to find the different drawings of a graph?
It is a hell of a problem.
In general, the basic idea is to simplify the graph into a canonical form, and then compare canonical forms. Spanning trees are generated with this objective, but spanning trees are not unique, so you need a canonical way to create them.
After you have canonical forms, you can perform the isomorphism comparison (relatively) easily, but that's just the start, since non-isomorphic graphs can have the same spanning tree. (E.g., think about a spanning tree T and a single addition of an edge to it to create T'. These two graphs are not isomorphic, but they have the same spanning tree.)
Other techniques involve comparing descriptors (e.g., number of nodes, number of edges), which can produce false positives in general.
I suggest you start with the wiki page about the graph isomorphism problem. I also have a book to suggest: "Graph Theory and Its Applications". It's a tome, but worth every page.
As for your corollary: every possible spatial layout of a given graph's vertices is isomorphic to it. So two isomorphic graphs have the same topology and are, in the end, the same graph from the topological point of view. Another matter is, for example, to find those isomorphic structures enjoying particular properties (e.g., with non-crossing edges, if they exist), and that depends on the properties you want.
One of the best algorithms out there for finding graph isomorphisms is VF2.
I've written a high-level overview of VF2 as applied to chemistry - where it is used extensively. The post touches on the differences between VF2 and Ullmann. There is also a from-scratch implementation of VF2 written in Java that might be helpful.
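If you just want to experiment, networkx ships a VF2 implementation behind nx.is_isomorphic and the isomorphism.GraphMatcher class:

    import networkx as nx

    G1 = nx.cycle_graph(5)
    G2 = nx.relabel_nodes(G1, {i: chr(97 + i) for i in range(5)})

    print(nx.is_isomorphic(G1, G2))        # True (VF2 under the hood)

    gm = nx.isomorphism.GraphMatcher(G1, G2)
    print(gm.is_isomorphic(), gm.mapping)  # True plus an explicit mapping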
A very similar problem - graph automorphism - can be solved by saucy, which is available in source code. This finds all symmetries of a graph. If you have two graphs, join them into one and any isomorphism can be discovered as an automorphism of the join.
Disclaimer: I am one of the co-authors of saucy.
There are algorithms to do this -- however, I have not had cause to seriously investigate them as of yet. I believe Donald Knuth has either written or is writing on this subject in his Art of Computer Programming series during his second pass at (re)writing it.
As for a simple way to do something that might work in practice on small graphs, I would recommend counting degrees, then for each vertex also noting the set of degrees of its adjacent vertices. This then gives you a set of potential vertex isomorphisms for each point. Then just try all of those (via brute force, but choosing the vertices in increasing order of the size of their candidate sets). Intuitively, most graph isomorphisms can practically be computed this way, though clearly there will be degenerate cases that take a long time.
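Here is a small sketch of that invariant (the function names are mine): each vertex gets a signature consisting of its degree plus the sorted degrees of its neighbors, and a vertex in one graph may only map to vertices in the other with an identical signature.

    def signatures(adj):
        # adj: dict mapping vertex -> set of neighbor vertices.
        return {v: (len(ns), tuple(sorted(len(adj[u]) for u in ns)))
                for v, ns in adj.items()}

    def candidates(adj1, adj2):
        # Candidate images in adj2 for each vertex of adj1.
        s1, s2 = signatures(adj1), signatures(adj2)
        return {v: {u for u in adj2 if s2[u] == s1[v]} for v in adj1}

    A = {0: {1, 2}, 1: {0}, 2: {0}}                 # path 1-0-2
    B = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}   # path x-y-z
    print(candidates(A, B))   # 0 -> {y}; 1 and 2 -> {x, z}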
I recently came across the following paper: http://arxiv.org/abs/0711.2010
This paper proposes "A Polynomial Time Algorithm for Graph Isomorphism"
My project - Griso - at sf.net: http://sourceforge.net/projects/griso/ with this description:
Griso is a graph isomorphism testing utility written in C++. It is based on my own polynomial-time algorithm (which is the main selling point of the project). See Griso's sample input/output on the http://funkybee.narod.ru/graphs.htm page.
nauty and Traces
nauty and Traces are programs for computing automorphism groups of graphs and digraphs. They can also produce a canonical label. They are written in a portable subset of C, and run on a considerable number of different systems.
The AutGroupGraph command in the GRAPE package for GAP.
bliss: another symmetry and canonical labeling program.
conauto: a graph isomorphism package.
As for heuristics: I've been fantasising about a modified Ullmann's algorithm, where you don't only use breadth-first search but mix it with depth-first search, like so: first you use breadth-first search intensively, then you set a limit for the breadth analysis and go deeper after checking a few neighbours, lowering the breadth by some amount at every step. This is practically how I find my way on a map: first I locate myself with breadth-first search, then I search the route with depth-first search - largely, and this is the best strategy my brain has ever invented. :) In the long term, some intelligence may be added to increase the breadth-first neighbour count at critical vertices - for example, where there are a large number of neighbouring vertices with the same edge count. Like occasionally checking your actual route by car (without a GPS).
I've since found out that the algorithm belongs to the category of k-dimensional Weisfeiler-Lehman algorithms, and it fails with regular graphs. For more, see:
http://dabacon.org/pontiff/?p=4148
Original post follows:
I've worked on the problem to find isomorphic graphs in a database of graphs (containing chemical compositions).
In brief, the algorithm creates a hash of a graph using the power iteration method. There might be false positive hash collisions, but the probability of that is exceedingly small (I didn't have any such collisions with tens of thousands of graphs).
The way the algorithm works is this:
Do N (where N is the radius of the graph) iterations. On each iteration and for each node:
Sort the hashes (from the previous step) of the node's neighbors
Hash the concatenated sorted hashes
Replace node's hash with newly computed hash
On the first step, a node's hash is affected by the direct neighbors of it. On the second step, a node's hash is affected by the neighborhood 2-hops away from it. On the Nth step a node's hash will be affected by the neighborhood N-hops around it. So you only need to continue running the Powerhash for N = graph_radius steps. In the end, the graph center node's hash will have been affected by the whole graph.
To produce the final hash, sort the final step's node hashes and concatenate them together. After that, you can compare the final hashes to find if two graphs are isomorphic. If you have labels, then add them (on the first step) in the internal hashes that you calculate for each node.
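A compact sketch of that iteration (this is my reconstruction, not the linked source; as noted above, it is a Weisfeiler-Lehman-style scheme and can be fooled by regular graphs):

    import hashlib

    def h(s):
        return hashlib.sha256(s.encode()).hexdigest()

    def graph_hash(adj, rounds):
        # adj: dict vertex -> set of neighbors; rounds ~ graph radius.
        hashes = {v: h("") for v in adj}   # seed with h(label) if labeled
        for _ in range(rounds):
            hashes = {v: h("".join(sorted(hashes[u] for u in adj[v])))
                      for v in adj}
        # Final hash: order-independent combination of all node hashes.
        return h("".join(sorted(hashes.values())))

    A = {0: {1}, 1: {0, 2}, 2: {1}}
    B = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
    print(graph_hash(A, 2) == graph_hash(B, 2))   # True: same shape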
There is more background here:
https://plus.google.com/114866592715069940152/posts/fmBFhjhQcZF
You can find the source code of it here:
https://github.com/madgik/madis/blob/master/src/functions/aggregate/graph.py
