Let's say I have a graph whose nodes are stored in a sorted list. I now want to topologically sort this graph while keeping the original order wherever the topological order is undefined.
Are there any good algorithms for this?
One possibility is to compute the lexicographically least topological order. The algorithm is to maintain a priority queue containing the nodes whose effective in-degree (over nodes not yet processed) is zero. Repeatedly dequeue the node with the least label, append it to the order, decrement the effective in-degrees of its successors, and enqueue the ones that now have in-degree zero. This produces 1234567890 on btilly's example but does not in general minimize inversions.
The properties I like about this algorithm are that the output has a clean definition obviously satisfied by only one order and that, whenever there's an inversion (node x appears after node y even though x < y), x's largest dependency is larger than y's largest dependency, which is an "excuse" of sorts for inverting x and y. A corollary is that, in the absence of constraints, the lex least order is sorted order.
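For concreteness, here is a minimal Python sketch of the priority-queue algorithm just described, with heapq as the priority queue (names illustrative; node labels are assumed comparable):
import heapq
from collections import defaultdict

def lex_least_topo_order(nodes, edges):
    # edges: list of (u, v) pairs meaning u must come before v
    succs = defaultdict(list)
    indeg = {v: 0 for v in nodes}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1
    heap = [v for v in nodes if indeg[v] == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        u = heapq.heappop(heap)          # least-labelled node that is ready
        order.append(u)
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                heapq.heappush(heap, v)
    return order                         # shorter than nodes if a cycle exists

# btilly's two-chain example from later in the thread:
print(lex_least_topo_order(range(10),
      [(1, 2), (2, 3), (3, 4), (4, 5), (6, 7), (7, 8), (8, 9), (9, 0)]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]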
The problem is two-fold:
Topological sort
Stable sort
After much trial and error I came up with a simple algorithm that resembles bubble sort but uses a topological ordering criterion.
I thoroughly tested the algorithm on full graphs with all possible edge combinations, so it can be considered proven.
Cyclic dependencies are tolerated and resolved according to the original order of the elements in the sequence. The resulting order is perfect and represents the closest possible match.
Here is the source code in C#:
static class TopologicalSort
{
    /// <summary>
    /// Delegate definition for dependency function.
    /// </summary>
    /// <typeparam name="T">The type.</typeparam>
    /// <param name="a">The A.</param>
    /// <param name="b">The B.</param>
    /// <returns>
    /// Returns <c>true</c> when A depends on B. Otherwise, <c>false</c>.
    /// </returns>
    public delegate bool TopologicalDependencyFunction<in T>(T a, T b);

    /// <summary>
    /// Sorts the elements of a sequence in dependency order according to comparison function with Gapotchenko algorithm.
    /// The sort is stable. Cyclic dependencies are tolerated and resolved according to original order of elements in sequence.
    /// </summary>
    /// <typeparam name="T">The type of the elements of source.</typeparam>
    /// <param name="source">A sequence of values to order.</param>
    /// <param name="dependencyFunction">The dependency function.</param>
    /// <param name="equalityComparer">The equality comparer.</param>
    /// <returns>The ordered sequence.</returns>
    public static IEnumerable<T> StableOrder<T>(
        IEnumerable<T> source,
        TopologicalDependencyFunction<T> dependencyFunction,
        IEqualityComparer<T> equalityComparer)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (dependencyFunction == null)
            throw new ArgumentNullException("dependencyFunction");
        if (equalityComparer == null)
            throw new ArgumentNullException("equalityComparer");

        var graph = DependencyGraph<T>.TryCreate(source, dependencyFunction, equalityComparer);
        if (graph == null)
            return source;

        var list = source.ToList();
        int n = list.Count;

    Restart:
        for (int i = 0; i < n; ++i)
        {
            for (int j = 0; j < i; ++j)
            {
                if (graph.DoesXHaveDirectDependencyOnY(list[j], list[i]))
                {
                    bool jOnI = graph.DoesXHaveTransientDependencyOnY(list[j], list[i]);
                    bool iOnJ = graph.DoesXHaveTransientDependencyOnY(list[i], list[j]);

                    bool circularDependency = jOnI && iOnJ;
                    if (!circularDependency)
                    {
                        var t = list[i];
                        list.RemoveAt(i);
                        list.Insert(j, t);
                        goto Restart;
                    }
                }
            }
        }

        return list;
    }

    /// <summary>
    /// Sorts the elements of a sequence in dependency order according to comparison function with Gapotchenko algorithm.
    /// The sort is stable. Cyclic dependencies are tolerated and resolved according to original order of elements in sequence.
    /// </summary>
    /// <typeparam name="T">The type of the elements of source.</typeparam>
    /// <param name="source">A sequence of values to order.</param>
    /// <param name="dependencyFunction">The dependency function.</param>
    /// <returns>The ordered sequence.</returns>
    public static IEnumerable<T> StableOrder<T>(
        IEnumerable<T> source,
        TopologicalDependencyFunction<T> dependencyFunction)
    {
        return StableOrder(source, dependencyFunction, EqualityComparer<T>.Default);
    }

    sealed class DependencyGraph<T>
    {
        private DependencyGraph()
        {
        }

        public IEqualityComparer<T> EqualityComparer
        {
            get;
            private set;
        }

        public sealed class Node
        {
            public int Position
            {
                get;
                set;
            }

            List<T> _Children = new List<T>();

            public IList<T> Children
            {
                get
                {
                    return _Children;
                }
            }
        }

        public IDictionary<T, Node> Nodes
        {
            get;
            private set;
        }

        public static DependencyGraph<T> TryCreate(
            IEnumerable<T> source,
            TopologicalDependencyFunction<T> dependencyFunction,
            IEqualityComparer<T> equalityComparer)
        {
            var list = source as IList<T>;
            if (list == null)
                list = source.ToArray();

            int n = list.Count;
            if (n < 2)
                return null;

            var graph = new DependencyGraph<T>();
            graph.EqualityComparer = equalityComparer;
            graph.Nodes = new Dictionary<T, Node>(n, equalityComparer);

            bool hasDependencies = false;

            for (int position = 0; position < n; ++position)
            {
                var element = list[position];

                Node node;
                if (!graph.Nodes.TryGetValue(element, out node))
                {
                    node = new Node();
                    node.Position = position;
                    graph.Nodes.Add(element, node);
                }

                foreach (var anotherElement in list)
                {
                    if (equalityComparer.Equals(element, anotherElement))
                        continue;

                    if (dependencyFunction(element, anotherElement))
                    {
                        node.Children.Add(anotherElement);
                        hasDependencies = true;
                    }
                }
            }

            if (!hasDependencies)
                return null;

            return graph;
        }

        public bool DoesXHaveDirectDependencyOnY(T x, T y)
        {
            Node node;
            if (Nodes.TryGetValue(x, out node))
            {
                if (node.Children.Contains(y, EqualityComparer))
                    return true;
            }
            return false;
        }

        sealed class DependencyTraverser
        {
            public DependencyTraverser(DependencyGraph<T> graph)
            {
                _Graph = graph;
                _VisitedNodes = new HashSet<T>(graph.EqualityComparer);
            }

            DependencyGraph<T> _Graph;
            HashSet<T> _VisitedNodes;

            public bool DoesXHaveTransientDependencyOnY(T x, T y)
            {
                if (!_VisitedNodes.Add(x))
                    return false;

                Node node;
                if (_Graph.Nodes.TryGetValue(x, out node))
                {
                    if (node.Children.Contains(y, _Graph.EqualityComparer))
                        return true;

                    foreach (var i in node.Children)
                    {
                        if (DoesXHaveTransientDependencyOnY(i, y))
                            return true;
                    }
                }

                return false;
            }
        }

        public bool DoesXHaveTransientDependencyOnY(T x, T y)
        {
            var traverser = new DependencyTraverser(this);
            return traverser.DoesXHaveTransientDependencyOnY(x, y);
        }
    }
}
And a small sample application:
class Program
{
    static bool DependencyFunction(char a, char b)
    {
        switch (a + " depends on " + b)
        {
            case "A depends on B":
                return true;
            case "B depends on D":
                return true;
            default:
                return false;
        }
    }

    static void Main(string[] args)
    {
        var source = "ABCDEF";
        var result = TopologicalSort.StableOrder(source.ToCharArray(), DependencyFunction);
        Console.WriteLine(string.Concat(result));
    }
}
Given the input elements {A, B, C, D, E, F} where A depends on B and B depends on D the output is {D, B, A, C, E, F}.
UPDATE:
I wrote a small article about the stable topological sort objective, the algorithm, and its proof. I hope it gives more explanation and is useful to developers and researchers.
You have insufficient criteria to specify what you're looking for. For instance, consider a graph with two directed components.
1 -> 2 -> 3 -> 4 -> 5
6 -> 7 -> 8 -> 9 -> 0
Which of the following sorts would you prefer?
6, 7, 8, 9, 0, 1, 2, 3, 4, 5
1, 2, 3, 4, 5, 6, 7, 8, 9, 0
The first results from breaking all ties by putting the lowest node as close to the head of the list as possible. Thus 0 wins. The second results from trying to minimize the number of times that A < B and B appears before A in the topological sort. Both are reasonable answers. The second is probably more pleasing.
I can easily produce an algorithm for the first. To start, take the lowest node, and do a breadth-first search to find the nearest root node. Should there be a tie, identify the set of nodes that could appear on such a shortest path. Take the lowest node in that set, place the best possible path from it to a root, and then place the best possible path from the lowest node we started with to it. Search for the next lowest node that is not already in the topological sort, and continue.
Producing an algorithm for the more pleasing version seems much harder. See http://en.wikipedia.org/wiki/Feedback_arc_set for a related problem that strongly suggests that it is, in fact, NP-complete.
Here's an easy iterative approach to topological sorting: continually remove a node with in-degree 0, along with its edges.
To achieve a stable version, just modify to: continually remove the smallest-index node with in-degree 0, along with its edges.
In pseudo-python:
# N is the number of nodes, labeled 0..N-1
# edges[i] is a list of nodes j, corresponding to edges (i, j)
inDegree = [0] * N
for i in range(N):
    for j in edges[i]:
        inDegree[j] += 1

# Now we maintain a "frontier" of in-degree 0 nodes.
# We take the smallest one until the frontier is exhausted.
# Note: You could use a priority queue / heap instead of a list,
#       giving O(NlogN) runtime. This naive implementation is
#       O(N^2) worst-case (when the order is very ambiguous).
frontier = []
for i in range(N):
    if inDegree[i] == 0:
        frontier.append(i)

order = []
while frontier:
    i = min(frontier)
    frontier.remove(i)
    order.append(i)          # record i in the topological order
    for j in edges[i]:
        inDegree[j] -= 1
        if inDegree[j] == 0:
            frontier.append(j)

# Done - order is now a list of the nodes in topological order,
# with ties broken by original order in the list.
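To run the pseudo-python as real code, it's enough to define N and edges first; for instance, with btilly's two chains (this exact representation is just an assumption for illustration):
N = 10
edges = [[] for _ in range(N)]
for i, j in [(1, 2), (2, 3), (3, 4), (4, 5), (6, 7), (7, 8), (8, 9), (9, 0)]:
    edges[i].append(j)
# running the snippet above now gives
# order == [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]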
The depth-first search algorithm on Wikipedia worked for me:
const assert = chai.assert;

const stableTopologicalSort = ({
  edges,
  nodes
}) => {
  // https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search
  const result = [];
  const marks = new Map();
  const visit = node => {
    if (marks.get(node) !== `permanent`) {
      assert.notEqual(marks.get(node), `temporary`, `not a DAG`);
      marks.set(node, `temporary`);
      edges.filter(([, to]) => to === node).forEach(([from]) => visit(from));
      marks.set(node, `permanent`);
      result.push(node);
    }
  };
  nodes.forEach(visit);
  return result;
};

const graph = {
  edges: [
    [5, 11],
    [7, 11],
    [3, 8],
    [11, 2],
    [11, 9],
    [11, 10],
    [8, 9],
    [3, 10]
  ],
  nodes: [2, 3, 5, 7, 8, 9, 10, 11]
};

assert.deepEqual(stableTopologicalSort(graph), [5, 7, 11, 2, 3, 8, 9, 10]);
<script src="https://cdnjs.cloudflare.com/ajax/libs/chai/4.2.0/chai.min.js"></script>
Interpreting "stable topological sort" as a linearization of a DAG such that the ranges in the linearization where the topological order doesn't matter are sorted lexicographically: this can be solved with the DFS method of linearization, with the modification that nodes are visited in lexicographical order.
I have a Python Digraph class with a linearization method which looks like this:
def linearize_as_needed(self):
    if self.islinearized:
        return

    # Algorithm: DFS Topological sort
    # https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search
    temporary = set()
    permanent = set()
    L = []

    def visit(vertices):
        for vertex in sorted(vertices, reverse=True):
            if vertex in permanent:
                pass
            elif vertex in temporary:
                raise NotADAG
            else:
                temporary.add(vertex)
                if vertex in self.arrows:
                    visit(self.arrows[vertex])
                L.append(vertex)
                temporary.remove(vertex)
                permanent.add(vertex)
        # print('visit: {} => {}'.format(vertices, L))

    visit(self.vertices)

    self._linear = list(reversed(L))
    self._iter = iter(self._linear)
    self.islinearized = True
Here self.vertices is the set of all vertices, and self.arrows holds the adjacency relation as a dict of left nodes to sets of right nodes.
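Since linearize_as_needed is a method of a class that isn't shown in full, here is a self-contained Python sketch of the same idea; it assumes the graph is given as a plain dict arrows mapping each vertex to the set of vertices it points to:
def lex_linearize(vertices, arrows):
    # DFS topological sort; candidates are visited in reverse-sorted order so
    # that the reversed postorder prefers lexicographically smaller vertices
    temporary, permanent, L = set(), set(), []

    def visit(vs):
        for vertex in sorted(vs, reverse=True):
            if vertex in permanent:
                continue
            if vertex in temporary:
                raise ValueError("not a DAG")
            temporary.add(vertex)
            visit(arrows.get(vertex, ()))
            L.append(vertex)
            temporary.remove(vertex)
            permanent.add(vertex)

    visit(vertices)
    return list(reversed(L))

arrows = {5: {11}, 7: {11}, 3: {8, 10}, 11: {2, 9, 10}, 8: {9}}
print(lex_linearize({2, 3, 5, 7, 8, 9, 10, 11}, arrows))
# [3, 5, 7, 8, 11, 2, 9, 10]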
I'm solving a problem where you have N events (1 <= N <= 100000) over M days (2 <= M <= 10^9). You are trying to find the minimum time of occurrence for each event.
For each event, you know that it couldn't have occurred prior to a day Si. You also have C triples (1 <= C <= 10^5) described by (a, b, x). An event b must have occurred at least x days after a.
Example:
There are 4 events, spread over 10 days. Event 1 had to occur on Day 1 or after. Event 2 had to occur on Day 2 or after. Event 3 had to occur on Day 3 or after. Event 4 had to occur on Day 4 or after.
The triples are (1, 2, 5); (2, 4, 2); (3, 4, 4). This means that Event 2 had to occur at least 5 days after Event 1; Event 4 had to occur at least 2 days after Event 2; and Event 4 had to occur at least 4 days after Event 3.
The solution is that Event 1 occurred on Day 1; Event 2 occurred on Day 6; Event 3 occurred on Day 3; and Event 4 occurred on Day 8. The reasoning: Event 2 occurred at least five days after Event 1, so it cannot have occurred before Day 1+5=6. Event 4 occurred at least two days after Event 2, so it cannot have occurred before Day 6+2=8 (the constraint from Event 3 only forces Day 3+4=7, so Day 8 is the binding bound).
My solution:
I had the idea to use the triples to create a Directed graph. So in the example above, the graph would look like this:
1 --5-> 2 --2-> 4
3 --4-> 4
Basically you create a directed edge from the event that happened first to the event that had to happen after it. The edge weight is the minimum number of days that must separate the two.
I thought we would first use the input data to create the graph. Then, you would binary search on all possible starting dates of the first event (1 through 10^9, so about 30 iterations). In this case, the first event is Event 1. Then, you would go through the graph and see if this starting date was possible. If you ever encountered an event whose occurrence date was before its Si date, you would terminate this search and continue binary searching. This solution would have worked easily if it weren't for the "event b must have occurred AT LEAST x days after a" requirement.
Does anyone have any other solutions for solving this problem, or how to alter mine so that it works? Thank you! If you have any questions please let me know :))
This can be mapped to a Simple Temporal Network, for which the literature is rich, e.g.:
Dechter, Rina, Itay Meiri, and Judea Pearl. "Temporal constraint networks." Artificial Intelligence 49.1-3 (1991): 61-95.
Planken, Léon Robert. "Algorithms for Simple Temporal Reasoning." (2013). Full dissertation.
As indicated in the comments, all-pairs shortest paths can calculate the minimal network (which also generates new arcs/constraints between all these events). If your graph is sparse, Johnson's algorithm is better than Floyd-Warshall.
If you don't care about the complete minimal network, but only about the bounds of your events, you are only interested in the first column and the first row of the all-pairs shortest-paths distance matrix. You can calculate these values by applying Bellman-Ford twice: once from the root and once with all edges reversed. These values are the distances root -> i and i -> root, where root is a virtual vertex representing time 0.
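As a concrete illustration of the single-source case, here is a hedged Python sketch (not taken from the papers above): the earliest event times are longest-path distances from the virtual root, i.e. Bellman-Ford with max-relaxation:
def earliest_times(S, triples):
    # virtual root 0 at time 0; edges root -> i with weight S[i-1],
    # and a -> b with weight x for each triple (a, b, x)
    n = len(S)
    edges = [(0, i, s) for i, s in enumerate(S, start=1)]
    edges += [(a, b, x) for a, b, x in triples]
    t = [0] + [float("-inf")] * n
    for _ in range(n + 1):               # Bellman-Ford, max instead of min
        changed = False
        for u, v, w in edges:
            if t[u] + w > t[v]:
                t[v] = t[u] + w
                changed = True
        if not changed:
            return t[1:]
    raise ValueError("positive cycle: the constraints are infeasible")

print(earliest_times([1, 2, 3, 4], [(1, 2, 5), (2, 4, 2), (3, 4, 4)]))
# [1, 6, 3, 8]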
Just some remarks about things Damien indicated (reasoning from scratch, it seems: impressive):
we use negative weights in the general problem, so pure Dijkstra won't do
existence of a negative cycle <-> infeasibility / no solution / inconsistency
there will be a need for some root vertex which is the origin of time
Edit: The above somewhat targets strong inference/propagation, like giving tight bounds on the value domains.
If you are only interested in some consistent solution, it might be another idea to post these constraints as a linear program and use one of the highly optimized implementations to solve it (open-source world: CoinOR clp; maybe Google's glop). Simplex-based solvers should give you an integral solution (I think the problem is totally unimodular). Interior-point based solvers should be faster, but I'm not sure whether your result will be integral without an additional crossover step. (It might be a good idea to add some dummy objective like min(max(x)), makespan-like.)
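A hedged sketch of that LP idea with scipy's linprog (scipy is an assumption here; minimizing the sum of the variables drives each one to its individual minimum, because this is a difference-constraint system):
from scipy.optimize import linprog

def earliest_times_lp(S, triples):
    n = len(S)
    A_ub, b_ub = [], []
    for a, b, x in triples:            # t[b] - t[a] >= x  <=>  t[a] - t[b] <= -x
        row = [0] * n
        row[a - 1], row[b - 1] = 1, -1
        A_ub.append(row)
        b_ub.append(-x)
    bounds = [(s, None) for s in S]    # t[i] >= S_i
    res = linprog(c=[1] * n, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return list(res.x)

print(earliest_times_lp([1, 2, 3, 4], [(1, 2, 5), (2, 4, 2), (3, 4, 4)]))
# approximately [1.0, 6.0, 3.0, 8.0]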
Consider a topological sort of your DAG.
For a list L corresponding to the toposort of your graph, the leaves come at the end. Then, for a vertex v just before those leaves,
L = [..., v, leaves]
you know that the edges outgoing from v can only go to the vertices after it (here the leaves).
This allows you to compute the minimal weight associated with v by applying Damien's max.
Do so up to the head of L.
Topological sorting is O(V+E).
Here is an illustration with a more interesting graph (read it from top to bottom):
    5
   / \
  4   7
     / \
    1   2
    |
    0
    |
    6
A topological ordering is (4601275).
So we will visit in order 4, 6, 0, 1, 2, 7, and then 5; any vertex we visit has all its dependencies already computed.
Assume each vertex k has its event occurring after 2^k days. This "after" date is referred to as the vertex's weight.
e.g vertex 4 is weighted 2^4
Assume each edge (i,j) is weighted 5*i + j
6 is weighted 2^6 = 64
0 is weighted max(2^0, 64 + (0*5+6)) = 70
1 takes max(2^1, 70 + 5) = 75
7 takes max(2^7, 75 + (5*7+1), 2^2 + (5*7+2)) = 2^7
The point to highlight (here for 7) is that the minimal date induced by the dependencies of a node may occur before the date attached to the node itself (and we have to keep the biggest one).
function topologicalSort({ V, E }) {
  const visited = new Set()
  const stack = []

  function dfs (v) {
    if (visited.has(v)) { return }
    E.has(v) && E.get(v).forEach(({ to, w }) => dfs(to))
    visited.add(v)
    stack.push(v)
  }

  // process nodes without incoming edges first
  const heads = new Set([...V])
  for (const v of V) {
    const edges = E.get(v)
    edges && edges.forEach(({ to }) => heads.delete(to))
  }
  for (const v of heads) {
    dfs(v)
  }
  for (const v of V) {
    dfs(v)
  }
  return stack
}

class G {
  constructor () {
    this.V = new Set()
    this.E = new Map()
  }
  setEdges (from, tos) {
    this.V.add(from)
    tos.forEach(({ to, w }) => this.V.add(to))
    this.E.set(from, tos)
  }
}

function solve ({ g, vToWeight }) {
  const stack = topologicalSort(g)
  console.log('ordering', stack.join(''))
  stack.forEach(v => {
    const edges = g.E.get(v)
    if (!edges) { return }
    const newval = Math.max(
      vToWeight.get(v),
      ...edges.map(({ to, w }) => vToWeight.get(to) + w)
    )
    console.log('setting best for', v, edges.map(({ to, w }) => [vToWeight.get(to), w].join('+')))
    vToWeight.set(v, newval)
  })
  return vToWeight
}

function demo () {
  const g = new G()
  g.setEdges(2, [{ to: 1, w: 5 }])
  g.setEdges(4, [{ to: 2, w: 2 }, { to: 3, w: 4 }])
  const vToWeight = new Map([
    [1, 1],
    [2, 6],
    [3, 3],
    [4, 4]
  ])
  return { g, vToWeight }
}

function demo2 () {
  const g = new G()
  const addEdges = (i, ...tos) => {
    g.setEdges(i, tos.map(to => ({ to, w: 5 * i + to })))
  }
  addEdges(5, 4, 7)
  addEdges(7, 1, 2)
  addEdges(1, 0)
  addEdges(0, 6)
  const vToWeight = new Map([...g.V].map(v => [v, 2 ** v]))
  return { g, vToWeight }
}

function dump (map) {
  return [...map].map(([k, v]) => k + '->' + v)
}
console.log("----OP's sol----\n", dump(solve(demo())))
console.log('----that case---\n',dump(solve(demo2())))
The distance matrix (between all pairs of events = nodes) can be obtained in an iterative way, similar to the Floyd-Warshall algorithm. Basically, iteratively:
T(x, y) = max(T(x, y), T(x, z) + T(z, y))
However, as mentioned by the OP in a comment, the Floyd-Warshall algorithm is O(n^3), which is too much for values of n up to 10^5.
A key point is that no cycle exists, and therefore a more efficient algorithm should exist.
A nice approach was proposed by grodzi: use a topological sort of the Directed Acyclic Graph (DAG).
I made an implementation in C++ according to this idea, with one main difference:
I used a simple sort (from the C++ standard library) to build the topological ordering. Doing so is simple and has a complexity of O(n log n). The dedicated method proposed by grodzi could be more efficient (it seems O(n)), but this way is very easy to implement and the overall complexity remains low.
After the topological sort, we know that a given event only depends on the events before it. For this part, this ensures a complexity of O(C), where C is the number of triples, i.e. the number of edges.
#include <iostream>
#include <vector>
#include <set>
#include <unordered_set>
#include <algorithm>
#include <tuple>
#include <numeric>

struct Triple {
    int event1;
    int event2;
    int days;
};

struct Pred {
    int pred;
    int days;
};

void print_result (const std::vector<int> &index, const std::vector<int> &times) {
    int n = times.size();
    for (int i = 0; i < n; i++) {
        std::cout << index[i]+1 << " " << times[index[i]] << "\n";
    }
}

std::tuple<std::vector<int>, std::vector<int>> ordering (int n, const std::vector<Triple> &triples) {
    std::vector<int> index(n);
    std::vector<int> times(n, 0);
    std::iota (index.begin(), index.end(), 0);

    // Build predecessors matrix and sets
    std::vector<std::vector<Pred>> pred (n);
    std::vector<std::unordered_set<int>> set_pred (n);
    for (auto &triple: triples) {
        pred[triple.event2 - 1].emplace_back(Pred{triple.event1 - 1, triple.days});
        set_pred[triple.event2 - 1].insert(triple.event1 - 1);
    }

    // Topological sort
    std::sort (index.begin(), index.end(), [&set_pred] (int &i, int &j) {return set_pred[j].find(i) != set_pred[j].end();});

    // Iterative calculation of times of arrival
    for (int i = 1; i < n; ++i) {
        int ip = index[i];
        for (auto &p: pred[ip]) {
            times[ip] = std::max(times[ip], times[p.pred] + p.days);
        }
    }

    // Final sort, according to times of arrival
    std::sort (index.begin(), index.end(), [&times] (int &i, int &j) {return times[i] < times[j];});

    return {index, times};
}

int main() {
    int n_events = 4;
    std::vector<Triple> triples = {
        {1, 2, 5},
        {1, 3, 1},
        {3, 2, 6},
        {3, 4, 1}
    };
    std::vector<int> index(n_events);
    std::vector<int> times(n_events);

    std::tie (index, times) = ordering (n_events, triples);
    print_result (index, times);
}
Result:
1 0
3 1
4 2
2 7
Here is a problem I've seen; does anyone have any ideas about it?
http://judgecode.com/problems/1002
Given a non-empty array of N integers A, please find the smallest integer P such that all the numbers in A are in the subarray A[0..P].
Since you have to scan all the items of the array, you need at least an O(N) algorithm. I suggest scanning the array while adding items to a hash set; whenever the hash set's size increases (i.e. a unique item was added), remember the item's index. Return the last index remembered.
C# implementation:
private static int MinIndex(IEnumerable<int> source) {
    HashSet<int> used = new HashSet<int>();
    int index = -1;
    int result = -1;
    foreach (int item in source) {
        index += 1;
        if (used.Add(item))   // unique item added
            result = index;   // the last unique item's index so far
    }
    return result;
}
Test
int[] sample = new [] { 2, 2, 1, 0, 1, };
Console.Write(MinIndex(sample));
The outcome is 3: all the distinct items of the initial array, {0, 1, 2}, are in the subarray A[0..3], which is [2, 2, 1, 0].
Consider an infinite binary tree defined as follows.
For a node labelled v, let its left child be denoted 2*v and its right child 2*v+1. The root of the tree is labelled 1.
You are given n ranges [a_1, b_1], [a_2, b_2], ..., [a_n, b_n] with a_i <= b_i for all i; each range [a_i, b_i] denotes the set of all integers not less than a_i and not greater than b_i. For example, [5,9] would represent the set {5,6,7,8,9}.
For a given integer T, let S be the union of the ranges [a_i, b_i] over all i up to n.
I need to find the number of unique pairs (irrespective of order) of elements x, y in S such that lca(x, y) = T.
(Wikipedia has a pretty good explanation of what the LCA of two nodes is.)
For example, for input:
A = {2, 12, 11}
B = {3, 13, 12}
T = 1
The output should be 6. (The ranges are [2,3], [12,13], and [11,12], and their union is the set {2,3,11,12,13}. Of all 10 possible pairs, exactly 6 of them ((2,3), (2,13), (3,11), (3,12), (11,13), and (12,13)) have an LCA of 1.)
And for input:
A = {1,7}
B = {2,15}
T = 3
The output should be 6. (The given ranges are [1,2] and [7,15], and their union is the set {1,2,7,8,9,10,11,12,13,14,15}. Of the 55 possible pairs, exactly 6 of them ((7,12), (7,13), (12,14), (12,15), (13,14) and (13,15)) have an LCA of 3.)
Well, it is fairly simple to compute the LCA of two nodes in your notation, using this recursive method:
int lca(int a, int b) {
    if(a == b) return a;
    if(a < b) return lca(b, a);
    return lca(a/2, b);
}
Now, to find the union of the sets, we first need to be able to find what set a particular range represents. Let's introduce a factory method for this:
Set<Integer> rangeSet(int a, int b){
    Set<Integer> result = new HashSet<Integer>(b-a);
    for(int n = a; n <= b; n++) result.add(n);
    return result;
}
This will return a Set<Integer> containing all the integers contained in the range.
To find the union of these sets, just addAll their elements to one set:
Set<Integer> unionSet(Set<Integer> ... sets){
    Set<Integer> result = new HashSet<Integer>();
    for(Set<Integer> s: sets)
        result.addAll(s);
    return result;
}
Now, we need to iterate over all possible pairs in the set:
int pairLcaCount(int t, Set<Integer> nodes){
    int result = 0;
    for(int x: nodes)
        for(int y: nodes)
            if(x > y && lca(x,y) == t) result++;
    return result;
}
Everything else is just glue logic, methods to convert from your input requirements to the ones taken here. For instance, something like:
Set<Integer> unionSetFromBoundsLists(int[] a, int[] b){
    // generic array creation is illegal in Java, so create a raw array and cast
    @SuppressWarnings("unchecked")
    Set<Integer>[] ranges = (Set<Integer>[]) new Set[a.length];
    for(int idx = 0; idx < ranges.length; idx++)
        ranges[idx] = rangeSet(a[idx], b[idx]);
    return unionSet(ranges);
}
I'm stuck on a code challenge, and I want a hint.
PROBLEM: You are given a tree data structure (without cycles) and are asked to remove as many "edges" (connections) as possible so that each resulting smaller tree has an even number of nodes. This problem is always solvable, as the tree has an even number of nodes.
Your task is to count the removed edges.
Input:
The first line of input contains two integers N and M. N is the number of vertices and M is the number of edges. 2 <= N <= 100.
The next M lines each contain two integers ui and vi which specify an edge of the tree (1-based indexing).
Output:
Print the number of edges removed.
Sample Input
10 9
2 1
3 1
4 3
5 2
6 1
7 2
8 6
9 8
10 8
Sample Output :
2
Explanation : On removing the edges (1, 3) and (1, 6), we can get the desired result.
I used BFS to travel through the nodes.
First, maintain a separate array that stores, for each node, the total number of its child nodes + 1 (i.e. its subtree size).
So you can initially assign all the leaf nodes the value 1 in this array.
Now start from the last node and count the number of children for each node. This works bottom-up, and the array holding the subtree sizes helps optimize the code at runtime.
Once you have this array filled in for all the nodes, just counting the nodes with an even subtree size gives the answer (a sketch follows below). Note: I did not include the root node in the count in the final step.
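A Python sketch of this bottom-up counting (names illustrative; it roots the tree at vertex 1 and works even if the input edges arrive in arbitrary order):
from collections import defaultdict

def count_removable_edges(n, edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {1: None}                  # root the tree at vertex 1
    order = [1]
    for u in order:                     # BFS to fix parent/child directions
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
    size = dict.fromkeys(order, 1)      # leaf nodes start at 1
    for u in reversed(order):           # bottom-up subtree sizes
        if parent[u] is not None:
            size[parent[u]] += size[u]
    # every non-root node with an even subtree size marks a removable edge
    return sum(1 for u in order if parent[u] is not None and size[u] % 2 == 0)

edges = [(2, 1), (3, 1), (4, 3), (5, 2), (6, 1), (7, 2), (8, 6), (9, 8), (10, 8)]
print(count_removable_edges(10, edges))  # 2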
This is my solution. I didn't use a BFS tree; I just allocated another array holding each node's total count of itself plus all of its children.
import java.util.Scanner;
import java.util.Arrays;

public class Solution {

    public static void main(String[] args) {
        int tree[];
        int count[];

        Scanner scan = new Scanner(System.in);
        int N = scan.nextInt(); //points
        int M = scan.nextInt();

        tree = new int[N];
        count = new int[N];
        Arrays.fill(count, 1);

        for(int i=0;i<M;i++)
        {
            int u1 = scan.nextInt();
            int v1 = scan.nextInt();

            tree[u1-1] = v1;
            count[v1-1] += count[u1-1];

            int root = tree[v1-1];
            while(root!=0)
            {
                count[root-1] += count[u1-1];
                root = tree[root-1];
            }
        }

        System.out.println("");
        int counter = -1;
        for(int i=0;i<count.length;i++)
        {
            if(count[i]%2==0)
            {
                counter++;
            }
        }
        System.out.println(counter);
    }
}
If you observe the input, you can see that it is quite easy to count the number of nodes under each node. With (a b) as the edge input, in every case a is the child and b is the immediate parent; the input always presents edges bottom-up.
So it's essentially the number of nodes which have an even count (excluding the root node). I submitted the code below on Hackerrank and all the tests passed. I guess all the inputs satisfy this rule.
def find_edges(count):
    root = max(count)
    count_even = 0
    for cnt in count:
        if cnt % 2 == 0:
            count_even += 1
    if root % 2 == 0:
        count_even -= 1
    return count_even

def count_nodes(edge_list, n, m):
    count = [1 for i in range(0, n)]
    for i in range(m-1, -1, -1):
        count[edge_list[i][1]-1] += count[edge_list[i][0]-1]
    return find_edges(count)
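For instance, feeding the sample input from the question into count_nodes reproduces the expected answer:
edges = [(2, 1), (3, 1), (4, 3), (5, 2), (6, 1), (7, 2), (8, 6), (9, 8), (10, 8)]
print(count_nodes(edges, 10, 9))  # 2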
I know that this has already been answered lots and lots of times. I still want reviews on my solution. I tried to construct the child counts as the edges came through the input, and it passed all the test cases.
namespace Hackerrank
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            var tempArray = Console.ReadLine().Split(' ').Select(x => Convert.ToInt32(x)).ToList();
            int verticeNumber = tempArray[0];
            int edgeNumber = tempArray[1];
            Dictionary<int, int> childCount = new Dictionary<int, int>();
            Dictionary<int, int> parentDict = new Dictionary<int, int>();
            for (int count = 0; count < edgeNumber; count++)
            {
                var nodes = Console.ReadLine().Split(' ').Select(x => Convert.ToInt32(x)).ToList();
                var node1 = nodes[0];
                var node2 = nodes[1];
                if (childCount.ContainsKey(node2))
                    childCount[node2]++;
                else childCount.Add(node2, 1);
                var parent = node2;
                while (parentDict.ContainsKey(parent))
                {
                    var par = parentDict[parent];
                    childCount[par]++;
                    parent = par;
                }
                parentDict[node1] = node2;
            }
            Console.WriteLine(childCount.Count(x => x.Value % 2 == 1) - 1);
        }
    }
}
My first inclination is to work up from the leaf nodes, because you cannot cut their edges, as that would leave single-vertex subtrees.
Here's the approach that I used to successfully pass all the test cases.
1. Mark vertex 1 as the root.
2. Starting at the current root vertex, consider each child. If the sum total of the child and all of its children is even, then you can cut that edge.
3. Descend to the next vertex (a child of the current root) and let that be the new root vertex. Repeat step 2 until you have traversed all of the nodes (depth-first search). A sketch of this procedure follows.
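Here is a minimal recursive Python sketch of those steps (assuming a 1-based edge list as in the problem; the cut count is unaffected by whether even subtrees are actually detached, since removing an even part preserves the parity of the rest):
import sys
from collections import defaultdict

def count_cuts(n, edge_list):
    sys.setrecursionlimit(max(10000, 2 * n))
    adj = defaultdict(list)
    for u, v in edge_list:
        adj[u].append(v)
        adj[v].append(u)

    cuts = 0

    def subtree_size(v, parent):        # step 2: cut every even child subtree
        nonlocal cuts
        size = 1
        for w in adj[v]:
            if w != parent:
                child = subtree_size(w, v)
                if child % 2 == 0:
                    cuts += 1
                size += child
        return size

    subtree_size(1, 0)                  # step 1: vertex 1 is the root
    return cuts

print(count_cuts(10, [(2, 1), (3, 1), (4, 3), (5, 2), (6, 1), (7, 2),
                      (8, 6), (9, 8), (10, 8)]))  # 2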
Here's the general outline of an alternative approach:
Find all of the articulation points in the graph.
Check each articulation point to see if edges can be removed there.
Remove legal edges and look for more articulation points.
Solution - traverse all the edges and count the number of even edges.
If we remove an edge from the tree and it results in two trees, each with an even number of vertices, let's call that edge an even edge.
If we remove an edge from the tree and it results in two trees with odd numbers of vertices, let's call that edge an odd edge.
Here is my solution in Ruby
num_vertices, num_edges = gets.chomp.split(' ').map { |e| e.to_i }

graph = Graph.new

(1..num_vertices).to_a.each do |vertex|
  graph.add_node_by_val(vertex)
end

num_edges.times do |edge|
  first, second = gets.chomp.split(' ').map { |e| e.to_i }
  graph.add_edge_by_val(first, second, 0, false)
end

even_edges = 0

graph.edges.each do |edge|
  dup = graph.deep_dup
  first_tree = nil
  second_tree = nil
  subject_edge = nil

  dup.edges.each do |e|
    if e.first.value == edge.first.value && e.second.value == edge.second.value
      subject_edge = e
      first_tree = e.first
      second_tree = e.second
    end
  end

  dup.remove_edge(subject_edge)

  if first_tree.size.even? && second_tree.size.even?
    even_edges += 1
  end
end

puts even_edges
Note - the code for the Graph, Node and Edge classes is linked from the original answer.
I have a dependency graph that I have represented as a Map<Node, Collection<Node>> (in Java-speak, or f(Node n) -> Collection[Node] as a function; this is a mapping from a given node n to a collection of nodes that depend on n). The graph is potentially cyclic*.
Given a list badlist of nodes, I would like to solve a reachability problem: i.e. generate a Map<Node, Set<Node>> badmap that represents a mapping from each node N in the list badlist to the set of nodes that includes N and every node that transitively depends on it.
Example:
(x -> y means node y depends on node x)
n1 -> n2
n2 -> n3
n3 -> n1
n3 -> n5
n4 -> n2
n4 -> n5
n6 -> n1
n7 -> n1
This can be represented as the adjacency map {n1: [n2], n2: [n3], n3: [n1, n5], n4: [n2, n5], n6: [n1], n7: [n1]}.
If badlist = [n4, n5, n1] then I expect to get badmap = {n4: [n4, n2, n3, n1, n5], n5: [n5], n1: [n1, n2, n3, n5]}.
I'm floundering in my search for graph-algorithm references online, so if anyone could point me at an efficient algorithm description for reachability, I'd appreciate it. (An example of something that is not helpful to me is http://www.cs.fit.edu/~wds/classes/cse5081/reach/reach.html since that algorithm only determines whether a specific node A is reachable from a specific node B.)
*cyclic: if you're curious, it's because it represents C/C++ types, and structures can have members which are pointers to the structure in question.
In Python:
def reachable(graph, badlist):
    badmap = {}
    for root in badlist:
        stack = [root]
        visited = set()
        while stack:
            v = stack.pop()
            if v in visited: continue
            stack.extend(graph[v])
            visited.add(v)
        badmap[root] = visited
    return badmap
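Applied to the graph from the question (note the adjacency dict needs an entry for n5, even if empty):
graph = {"n1": ["n2"], "n2": ["n3"], "n3": ["n1", "n5"], "n4": ["n2", "n5"],
         "n5": [], "n6": ["n1"], "n7": ["n1"]}
print(reachable(graph, ["n4", "n5", "n1"]))
# e.g. {'n4': {'n1', 'n2', 'n3', 'n4', 'n5'}, 'n5': {'n5'},
#       'n1': {'n1', 'n2', 'n3', 'n5'}}  (set order may vary)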
Here's what I ended up using, based on @quaint's answer:
(requires a few Guava classes for convenience)
static public <T> Set<T> findDependencies(
        T rootNode,
        Multimap<T, T> dependencyGraph)
{
    Set<T> dependencies = Sets.newHashSet();
    LinkedList<T> todo = Lists.newLinkedList();
    for (T node = rootNode; node != null; node = todo.poll())
    {
        if (dependencies.contains(node))
            continue;
        dependencies.add(node);
        Collection<T> directDependencies =
            dependencyGraph.get(node);
        if (directDependencies != null)
            todo.addAll(directDependencies);
    }
    return dependencies;
}

static public <T> Multimap<T,T> findDependencies(
        Iterable<T> rootNodes,
        Multimap<T, T> dependencyGraph)
{
    Multimap<T, T> dependencies = HashMultimap.create();
    for (T rootNode : rootNodes)
        dependencies.putAll(rootNode,
            findDependencies(rootNode, dependencyGraph));
    return dependencies;
}

static public void testDependencyFinder()
{
    Multimap<Integer, Integer> dependencyGraph =
        HashMultimap.create();
    dependencyGraph.put(1, 2);
    dependencyGraph.put(2, 3);
    dependencyGraph.put(3, 1);
    dependencyGraph.put(3, 5);
    dependencyGraph.put(4, 2);
    dependencyGraph.put(4, 5);
    dependencyGraph.put(6, 1);
    dependencyGraph.put(7, 1);

    Multimap<Integer, Integer> dependencies =
        findDependencies(ImmutableList.of(4, 5, 1), dependencyGraph);
    System.out.println(dependencies);
    // prints {1=[1, 2, 3, 5], 4=[1, 2, 3, 4, 5], 5=[5]}
}
You should maybe build a reachability matrix from your adjacency list for fast searches. I just found the paper Course Notes for CS336: Graph Theory by Jayadev Misra, which describes how to build the reachability matrix from an adjacency matrix.
If A is your adjacency matrix, the reachability matrix would be R = A + A² + ... + A^n where n is the number of nodes in the graph. A², A³, ... can be calculated by:
A² = A x A
A³ = A x A²
...
For the matrix multiplication, logical OR is used in place of + and logical AND is used in place of x. The complexity is O(n^4).
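A sketch of this construction in Python with numpy (assumed available), using 0/1 integer matrices and thresholding to emulate the Boolean semiring; note that reachability here excludes the node itself unless it lies on a cycle:
import numpy as np

def reachability_matrix(A):
    # R = A + A^2 + ... + A^n over the Boolean semiring
    n = A.shape[0]
    power = A.copy()
    reach = A.copy()
    for _ in range(n - 1):
        power = ((power @ A) > 0).astype(int)   # Boolean "x" via threshold
        reach |= power                          # Boolean "+"
    return reach

# the example graph, with n1..n7 mapped to indices 0..6
A = np.zeros((7, 7), dtype=int)
for u, v in [(0, 1), (1, 2), (2, 0), (2, 4), (3, 1), (3, 4), (5, 0), (6, 0)]:
    A[u, v] = 1
R = reachability_matrix(A)
print([i + 1 for i in np.flatnonzero(R[3])])    # nodes reachable from n4: [1, 2, 3, 5]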
Ordinary depth-first search or breadth-first search will do the trick: execute it once for each bad node.
Here's a working Java solution:
// build the example graph
Map<Node, Collection<Node>> graph = new HashMap<Node, Collection<Node>>();
graph.put(n1, Arrays.asList(new Node[] {n2}));
graph.put(n2, Arrays.asList(new Node[] {n3}));
graph.put(n3, Arrays.asList(new Node[] {n1, n5}));
graph.put(n4, Arrays.asList(new Node[] {n2, n5}));
graph.put(n5, Arrays.asList(new Node[] {}));
graph.put(n6, Arrays.asList(new Node[] {n1}));
graph.put(n7, Arrays.asList(new Node[] {n1}));
// compute the badmap
Node[] badlist = {n4, n5, n1};
Map<Node, Collection<Node>> badmap = new HashMap<Node, Collection<Node>>();
for(Node bad : badlist) {
    Stack<Node> toExplore = new Stack<Node>();
    toExplore.push(bad);
    Collection<Node> reachable = new HashSet<Node>(toExplore);
    while(toExplore.size() > 0) {
        Node aNode = toExplore.pop();
        for(Node n : graph.get(aNode)) {
            if(! reachable.contains(n)) {
                reachable.add(n);
                toExplore.push(n);
            }
        }
    }
    badmap.put(bad, reachable);
}
System.out.println(badmap);
Just like with Christian Ammer's approach, you take for A the adjacency matrix and use Boolean arithmetic when doing the following, where I is the identity matrix:
B = A + I;
C = B * B;
while (B != C) {
    B = C;
    C = B * B;
}
return B;
Furthermore, standard matrix multiplication (both arithmetical and logical) is O(n^3), not O(n^2). But if n <= 64, you can in effect drop one factor of n, because you can process 64 bits in parallel on today's 64-bit machines. For larger graphs 64-bit parallelism is useful too, but shader techniques might be even better.
EDIT: one can do 128 bits in parallel with SSE instructions, and with AVX even more.
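To make the bit-parallel trick concrete, here is a hedged Python sketch that stores each row of the Boolean matrix as an integer bitmask (Python ints act as arbitrary-width bitsets), so a whole row is OR-ed in one word operation per step of a Warshall pass:
def transitive_closure(rows):
    # rows[i] is a bitmask of i's successors; bit k set means edge i -> k
    reach = list(rows)
    n = len(reach)
    for k in range(n):                  # Floyd-Warshall over the Boolean semiring
        bit = 1 << k
        for i in range(n):
            if reach[i] & bit:          # i reaches k, so i reaches whatever k reaches
                reach[i] |= reach[k]
    return reach

# same example graph, n1..n7 as bits 0..6
rows = [0] * 7
for u, v in [(0, 1), (1, 2), (2, 0), (2, 4), (3, 1), (3, 4), (5, 0), (6, 0)]:
    rows[u] |= 1 << v
closure = transitive_closure(rows)
print([k + 1 for k in range(7) if closure[3] >> k & 1])  # from n4: [1, 2, 3, 5]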