Find the extreme for priority function / alphabet order - algorithm

We have an array of elements a1,a2,...aN from an alphabet E. Assuming |N| >> |E|.
For each symbol of the alphabet we define an unique integer priority = V(sym). Let's define V{i} := V(symbol(ai)) for the simplicity.
How can I find the priority function V for which:
Count(i)->MAX | V{i} < V{i+1}
In other words, I need to find the priorities / permutation of the alphabet for which the number of positions i, satisfying the condition V{i}<V{i+1}, is maximum.
Edit-1: Please, read carefully. I'm given an array ai and the task is to produce a function V. It's not about sorting the input array with a priority function.
Edit-2: Example
E = {a,b,c}; A = 'abcab$'; (here $ = artificial termination symbol, V{$}=+infinity)
One of the optimal priority functions is: V{a}=1,V{b}=2,V{c}=3, which gives us the following signs between array elements: a<b<c>a<b<$, resulting in 4 '<' signs of 5 total.

If elements could not have tied priorities, this would be trivial. Just sort by priority. But you can have equal priorities.
I would first sort the alphabet by priority. Then I'd extract the longest rising subsequence. That is the start of your answer. Extract the longest rising subsequence from what remains. Append that to your answer. Repeat the extraction process until the entire alphabet has been extracted.
I believe that this gives an optimal result, but I haven't tried to prove it. If it is not perfectly optimal, it still will be pretty good.
Now that I think I understand the problem, I can tell you that there is no good algorithm to solve it.
To see this let us first construct a directed graph whose vertices are your elements, and whose edges indicate how many times one element immediately preceeded another. You can create a priority function by dropping enough edges to get a directed acyclic graph, use the edges to create a partially ordered set, and then add order relations until you have a full linear order, from which you can trivially get a priority function. All of this is straightforward once you have figured out which edges to drop. Conversely given that directed graph and your final priority function, it is easy to figure out which set of edges you had to decide to drop.
Therefore your problem is entirely equivalent to figuring out a minimal set of edges you need to drop from athat directed graph to get athat directed acyclic graph. However as http://en.wikipedia.org/wiki/Feedback_arc_set says, this is a known NP hard problem called the minimum feedback arc set. begin update It is therefore very unlikely that there is a good algorithm for the graphs you can come up with. end update
If you need to solve it in practice, I'd suggest going for some sort of greedy algorithm. It won't always be right, but it will generally give somewhat reasonable results in reasonable time.
Update: Moron is correct, I did not prove NP-hard. However there are good heuristic reasons to believe that the problem is, in fact, NP-hard. See the comments for more.

There's a trivial reduction from Minimum Feedback Arc Set on directed graphs whose arcs can be arranged into an Eulerian path. I quote from http://www14.informatik.tu-muenchen.de/personen/jacob/Publications/JMMN07.pdf :
To the best of our knowledge, the
complexity status of minimum feedback
arc set in such graphs is open.
However, by a lemma of Newman, Chen,
and Lovász [1, Theorem 4], a
polynomial algorithm for [this problem]
would lead to a 16/9 approximation
algorithm for the general minimum
feedback arc set problem, improving
over the currently best known O(log n
log log n) algorithm [2].
Newman, A.: The maximum acyclic subgraph problem and degree-3 graphs.
In: Proceedings of the 4th
International Workshop on
Approximation Algorithms for
Combinatorial Optimization Problems,
APPROX. LNCS (2001) 147–158
Even,G.,Naor,J.,Schieber,B.,Sudan,M.:Approximatingminimumfeedbacksets
and multi-cuts in directed graphs. In:
Proceedings of the 4th International
Con- ference on Integer Programming
and Combinatorial Optimization. LNCS
(1995) 14–28

Related

Maximum weighted Hungarian method Using minimum Hungarian method

I have programmed the minimum Hungarian algorithm for a bipartite graph, with Dijkstra's algorithm to find the minimum cost of a maximum matching. However, I want to use such an algorithm to implement the maximum Hungarian algorithm and don't know if it's correct to just negate the edges, because I don't know if the algorithm will handle it.
My implementation is based on the explanation on the following site: https://www.ics.uci.edu/~eppstein/163/lecture6b.pdf
Given G=(AUB, E), the idea is to label the vertices via an artificial start vertex s which has edges with unsaturated nodes in A, and run Dijkstra's algorithm from s in order to label each vertex, then after labeling each, the edges will be reweighted by their original weight minus the labels of the edge's endpoints.
I have read a lot of articles, and the only I could see is that a minimum Hungarian algorithm can be handled well with maximum cost by negating each edge, however, I am afraid that due to the fact that Dijkstra's algorithm doesn't handle negative edges well, it won't work.
First find the maximum weight in your graph. Then negate all of the weights and add the maximum weight to them. Adding the original maximum to all of the negated values makes them all positive.
You can also use INT_MAX (or whatever is equivalent to it in the programming language you're using) instead of the maximum weight. This skips the step of finding the maximum weight, but could make the first iteration of the Hungarian Algorithm take longer, or cause you to need an extra iteration of the algorithm to get the result. It probably doesn't make much of a difference either way and the performance difference will vary based on the particular weights in your graph.

Algorithm for independent set of a graph?

is there an algorithm for finding all the independent sets of an directed graph ?
From what i've read an independent set represents a set formed by the nodes that are not adjacent.
So for this example I would have {1} {2} {1,3}
So how is possible to find all of them, I am thinking about something recursive but I don't really know the algorithm, if someone could point me in the right direction it would be much appreciated !
Thank you!
Typical way to find independent sets is to consider the complement of a graph. A complement of a graph is defined as a graph with the same set of vertices and an edge between a pair if and only if there is no edge between them in the original graph. An independent set in the graph corresponds to a clique in the complements. Finding all the cliques is exponential in complexity so you can not improve brute force much. Still I believe considering the complement of the graph may make the problem easier to deal with.
Other than complement and finding cliques, I can also think about "Graph Coloring", you color the vertices somehow that no two adjacent vertices have the same color (you can do it with a very simple heuristic algorithm like SL = Smallest Last), and then choose vertices in every color as a subset (as a maximal independent subset).
The only problem is that there are probably too many ways of coloring a graph. You have to keep all the found (maximal) independent sets and move on until you get enough sets!
The Bron–Kerbosch algorithm is commonly used for this problem, see the Wikipedia article for a description and pseudocode that can be turned into a useable program without too much problem. The size of output is, in the worst case, exponential in the number of vertices, but brute force will always be exponential while BK will be polynomial if the output is polynomial. In other words if you know that the output will be reasonable then BK will produce it in a reasonable time. This is an active area of research and there are a number of other algorithms that do the same thing with varying efficiency depending of the type and size of graph. There are applications in several areas, in particular genetics.

Topological sort of cyclic graph with minimum number of violated edges

I am looking for a way to perform a topological sorting on a given directed unweighted graph, that contains cycles. The result should not only contain the ordering of vertices, but also the set of edges, that are violated by the given ordering. This set of edges shall be minimal.
As my input graph is potentially large, I cannot use an exponential time algorithm. If it's impossible to compute an optimal solution in polynomial time, what heuristic would be reasonable for the given problem?
Eades, Lin, and Smyth proposed A fast and effective heuristic for the feedback arc set problem. The original article is behind a paywall, but a free copy is available from here.
There’s an algorithm for topological sorting that builds the vertex order by selecting a vertex with no incoming arcs, recursing on the graph minus the vertex, and prepending that vertex to the order. (I’m describing the algorithm recursively, but you don’t have to implement it that way.) The Eades–Lin–Smyth algorithm looks also for vertices with no outgoing arcs and appends them. Of course, it can happen that all vertices have incoming and outgoing arcs. In this case, select the vertex with the highest differential between incoming and outgoing. There is undoubtedly room for experimentation here.
The algorithms with provable worst-case behavior are based on linear programming and graph cuts. These are neat, but the guarantees are less than ideal (log^2 n or log n log log n times as many arcs as needed), and I suspect that efficient implementations would be quite a project.
Inspired by Arnauds answer and other interesting topological sorting algorithms have I created the cyclic-toposort project and published it on github. cyclic_toposort does exactly what you desire in that it quickly sorts a directed cyclic graph providing the minimal amount of violating edges. It optionally also provides the maximum groupings of nodes that are on the same topological level (and can therefore be activated at the same time) if desired.
If the problem is still relevant to you then I would be happy if you check out my project and let me know what you think!
This project was useful to my own neural network topology research, so I had to create something like this anyway. I am answering your question this late in case anyone else stumbles upon this thread in search for the same question.

Back Tracking Vs. Greedy Algorithm Maximum Independent Set

I implemented a back tracing algorithm using both a greedy algorithm and a back tracking algorithm.
The back tracking algorithm is as follows:
MIS(G= (V,E): a graph): largest set of independent vertices
1:if|V|= 0
then return .
3:end if
if | V|= 1
then return V
end if
pick u ∈ V
Gout←G−{u}{remove u from V and E }
Gn ← G−{ u}−N(u){N(u) are the neighbors of u}
Sout ←MIS(Gout)
Sin←MIS(Gin)∪{u}
return maxsize(Sout,Sin){return Sin if there’s a tie — there’s a reason for this.
}
The greedy algorithm is to iteratively pick the node with the smallest degree, place it in the MIS and then remove it and its neighbors from G.
After running the algorithm on varying graph sizes where the probability of an edge existing is 0.5, I have empirically found that the back tracking algorithm always found a smaller a smaller maximum independent set than the greedy algorithm. Is this expected?
Your solution is strange. Backtracking is usually used to yes/no problems, not optimization. The algorithm you wrote depends heavily on how you pick u. And it definitely is not backtracking because you never backtrack.
Such problem can be solved in a number of ways, e.g.:
genetic programming,
exhaustive searching,
solving the problem on dual graph (maximum clique problem).
According to Wikipedia, this is a NP-hard problem:
A maximum independent set is an independent set of the largest possible size for a given graph G.
This size is called the independence number of G, and denoted α(G).
The problem of finding such a set is called the maximum independent set problem and is an NP-hard optimization problem.
As such, it is unlikely that there exists an efficient algorithm for finding a maximum independent set of a graph.
So, for finding the maximum independent set of a graph, you should test all available states (with an algorithm which its time complexity is exponential). All other faster algorithms (like greedy, genetic or randomize ones), can not find the exact answer. They can guarantee to find a maximal independent set, but not the maximum one.
In conclusion, I can say that your backtracking approach is slower and accurate; but the greedy approach is only an approximation algorithm.

Algorithm for finding optimal node pairs in hexagonal graph

I'm searching for an algorithm to find pairs of adjacent nodes on a hexagonal (honeycomb) graph that minimizes a cost function.
each node is connected to three adjacent nodes
each node "i" should be paired with exactly one neighbor node "j".
each pair of nodes defines a cost function
c = pairCost( i, j )
The total cost is then computed as
totalCost = 1/2 sum_{i=1:N} ( pairCost(i, pair(i) ) )
Where pair(i) returns the index of the node that "i" is paired with. (The sum is divided by two because the sum counts each node twice). My question is, how do I find node pairs that minimize the totalCost?
The linked image should make it clearer what a solution would look like (thick red line indicates a pairing):
Some further notes:
I don't really care about the outmost nodes
My cost function is something like || v(i) - v(j) || (distance between vectors associated with the nodes)
I'm guessing the problem might be NP-hard, but I don't really need the truly optimal solution, a good one would suffice.
Naive algos tend to get nodes that are "locked in", i.e. all their neighbors are taken.
Note: I'm not familiar with the usual nomenclature in this field (is it graph theory?). If you could help with that, then maybe that could enable me to search for a solution in the literature.
This is an instance of the maximum weight matching problem in a general graph - of course you'll have to negate your weights to make it a minimum weight matching problem. Edmonds's paths, trees and flowers algorithm (Wikipedia link) solves this for you (there is also a public Python implementation). The naive implementation is O(n4) for n vertices, but it can be pushed down to O(n1/2m) for n vertices and m edges using the algorithm of Micali and Vazirani (sorry, couldn't find a PDF for that).
This seems related to the minimum edge cover problem, with the additional constraint that there can only be one edge per node, and that you're trying to minimize the cost rather than the number of edges. Maybe you can find some answers by searching for that phrase.
Failing that, your problem can be phrased as an integer linear programming problem, which is NP-complete, which means that you might get dreadful performance for even medium-sized problems. (This does not necessarily mean that the problem itself is NP-complete, though.)

Resources