DFS Greedy Chromatic Number

In school I learned that computing the chromatic number of an arbitrary graph is NP-complete.
I understand why the plain greedy algorithm does not work, but what about the following DFS/greedy algorithm?
The main idea is to do a DFS and, for every vertex not yet colored, assign the minimum color index not used by any of its neighbours.
I can't figure out a counterexample, and this question is blowing my mind.
Thanks for all of your answers.
Pseudocode
Chromatic(Vertex x){
    for each neighbour y of vertex x
        if color(y) = -1
            color(y) <- minimum color not used by any neighbour of y
            if color(y) >= numColors
                numColors <- color(y) + 1
            Chromatic(y)
}

Main(){
    set color(v) = -1 for every vertex v
    take an arbitrary vertex u and set color(u) = 0
    numColors = 1
    Chromatic(u)
    print numColors
}
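A runnable Python sketch of the pseudocode above (graph as an adjacency dict; the Petersen graph here is just a test subject):

```python
def greedy_dfs_coloring(adj, start):
    """DFS from `start`; each newly reached vertex receives the
    smallest color not used by any already-colored neighbour.
    Only the connected component of `start` gets colored."""
    color = {v: -1 for v in adj}
    color[start] = 0

    def visit(x):
        for y in adj[x]:
            if color[y] == -1:
                used = {color[z] for z in adj[y] if color[z] != -1}
                c = 0
                while c in used:
                    c += 1
                color[y] = c
                visit(y)

    visit(start)
    return color

# Petersen graph: outer 5-cycle 0-4, inner pentagram 5-9, spokes i -- i+5.
petersen = {
    0: [1, 4, 5], 1: [0, 2, 6], 2: [1, 3, 7], 3: [2, 4, 8], 4: [3, 0, 9],
    5: [0, 7, 8], 6: [1, 8, 9], 7: [2, 9, 5], 8: [3, 5, 6], 9: [4, 6, 7],
}
colors = greedy_dfs_coloring(petersen, 0)
print(max(colors.values()) + 1)  # 3 with this particular adjacency ordering
```

Note that the reported count depends on the order in which neighbours are stored: with this particular ordering the algorithm happens to 3-color the Petersen graph, and that order-sensitivity is exactly what the answers below exploit.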

Here's a concrete counterexample: the Petersen graph. Your algorithm can report 4 colors on it (depending on the order in which neighbours are visited), but the graph's chromatic number is 3.
The Petersen graph is a classical counterexample for many greedy attempts at graph problems, and for conjectures in graph theory.

The answer is that sometimes a vertex will have 2 colors available, and making the wrong choice will cause a problem an undetermined number of steps later.
Suppose you have vertices 1 through 9. Draw them around a circle. Then add edges to make the following true.
1, 2, 3 form a triangle.
3 connects to 4.
4, 5, 6 make a triangle.
5, 6, 7 make a triangle.
6, 7, 8 make a triangle.
7, 8, 9 make a triangle.
8, 9, 1 make a triangle.
9, 1, 2 make a triangle.
It is easy to color this with 3 colors. But a depth-first greedy algorithm has a choice of 2 colors it can give to vertex 4. Make the wrong choice, and you'll wind up needing 4 colors, not 3.
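A brute-force check (Python sketch; the vertex and edge list are transcribed from the construction above) confirms both claims:

```python
from itertools import product

# Edges of the 9-vertex construction described above.
edges = [(1, 2), (1, 3), (2, 3), (3, 4),
         (4, 5), (4, 6), (5, 6), (5, 7), (6, 7),
         (6, 8), (7, 8), (7, 9), (8, 9),
         (8, 1), (9, 1), (9, 2)]

def extends_to_3_coloring(fixed):
    """Brute force: can `fixed` (a partial {vertex: color} map)
    be completed to a proper 3-coloring of the graph?"""
    free = [v for v in range(1, 10) if v not in fixed]
    for assignment in product(range(3), repeat=len(free)):
        color = {**fixed, **dict(zip(free, assignment))}
        if all(color[u] != color[v] for u, v in edges):
            return True
    return False

# 3 colors suffice for the whole graph...
print(extends_to_3_coloring({}))                        # True
# ...but after coloring triangle 1, 2, 3 with colors 0, 1, 2,
# giving vertex 4 the "wrong" color 1 (instead of 0) dooms the search:
print(extends_to_3_coloring({1: 0, 2: 1, 3: 2, 4: 1}))  # False
print(extends_to_3_coloring({1: 0, 2: 1, 3: 2, 4: 0}))  # True
```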

How to use Dijkstra's algorithm to generate an undirected, unweighted graph of a given diameter?

My job is to generate a random undirected, unweighted graph with a given diameter d. What I have done so far is generate a random distance matrix D, where each element Dij represents the distance between the i-th and j-th vertices of the graph. So, basically, I am doing this:
if (i == j) {
    D[i][j] = 0;
} else {
    D[i][j] = D[j][i] = random.nextInt(d + 1);
}
The diagonal is zero because reaching a vertex from itself always takes zero effort, am I right?
Also, Dij = Dji because the graph is undirected. Are my assumptions right?
I want to use Java, but I tagged the question as language-agnostic because I need an algorithm, not code.
My next step is to use Dijkstra's algorithm to generate a random graph by generating an adjacency matrix. I know that Dijkstra's algorithm finds shortest paths, but can I use it for my case?
EDIT #1:
As you can see in the figure above, the diameter is 4, because the two most distant vertices, 2 and 7, have a distance of 4. For that reason, we have D[2][7] = D[7][2] = 4. Another example is D[3][6] = D[6][3] = 2, because to go from 3 to 6 we can take 3 -> 5 -> 6 or 3 -> 1 -> 6, and vice versa for going from 6 to 3.
What I am looking for is to generate a random graph given its diameter, i.e. the maximum distance between two vertices in the graph. I know there are many possible such graphs, but I need any one of them.
One idea is to assume that the number of vertices is d + 1 and connect each vertex to the following one. In that case we get a linear (path) graph.
Example (diameter = 2, number of vertices = 3):

    v | 1  2  3
    --+--------
    1 | 0  1  2
    2 | 1  0  1
    3 | 2  1  0

The diagonal = zero.
D1,2 = D2,1 = D2,3 = D3,2 = 1, because to go from 1 to 2, or from 2 to 3, there is a direct link.
D1,3 = D3,1 = 2, because to go from 1 to 3, the shortest path is 1 -> 2 -> 3.
Here is the graph associated to the above distance matrix:
I am looking for a better approach.
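As a sanity check on the linear-graph idea above, here is a Python sketch that builds a path of d + 1 vertices and verifies its diameter with plain BFS (a full random generator would still need to add extra vertices and edges without shortening the longest shortest path, which this sketch does not attempt):

```python
from collections import deque

def path_graph(d):
    """The linear-graph idea: d + 1 vertices in a row give an
    unweighted, undirected graph of diameter exactly d."""
    n = d + 1
    adj = {v: [] for v in range(n)}
    for v in range(n - 1):
        adj[v].append(v + 1)
        adj[v + 1].append(v)
    return adj

def diameter(adj):
    """Largest BFS eccentricity over all vertices."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best

print(diameter(path_graph(4)))  # 4
```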

highest tower of cubes (with numbers on sides)

Problem:
There are N cubes and M numbers. Each side of a cube has a number from 1 to M. You can stack one cube on another if their touching sides have the same number (the top side of the bottom cube and the bottom side of the top cube show the same number). Find the highest tower of cubes.
Input: the number N of cubes and the number M.
Example:
INPUT: N=5, M=6. Now we generate 5 random cubes, each with 6 sides whose values lie in <1, M>.
[2, 4, 3, 1, 4, 1]
[5, 1, 6, 6, 2, 5]
[2, 5, 3, 1, 1, 6]
[3, 5, 6, 1, 3, 4]
[2, 4, 4, 5, 5, 5]
How you interpret a single array of 6 numbers is up to you. Opposite sides of a cube might be index and 5-index (for the first cube, the opposite side of 4 would be 4). Opposite sides might also be index and index+1 or index-1, depending on whether index%2 == 0 or 1. I used the second convention.
Now let's say the first cube is our current tower. Depending on the rotation, the number on top might be any of 1, 2, 3, 4. If 1 is on top, we can stack the second, third or fourth cube on top of it; all of them have a side with number 1. The third cube even has two sides with number 1, so we can stack it in two different ways.
I won't analyse it to the end because this post would be too long. The final answer for these cubes (max height of the tower) is 5.
My current solution (you can SKIP this part):
Right now I'm building the tower recursively. Each call solves this subproblem: find the highest tower given the top number of the current tower and the current set of unused cubes. This way I can memoize, storing results keyed by (top number of tower, bitmask of used cubes). Despite the memoization I think that in the worst case (for small M) this solution has to store M*(2^N) values (and solve that many subproblems).
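For reference, the memoized recursion described above can be sketched in Python with a bitmask over cubes, using the example cubes and the index/index±1 opposite-side convention from the question:

```python
from functools import lru_cache

cubes = [
    [2, 4, 3, 1, 4, 1],
    [5, 1, 6, 6, 2, 5],
    [2, 5, 3, 1, 1, 6],
    [3, 5, 6, 1, 3, 4],
    [2, 4, 4, 5, 5, 5],
]

# Opposite-side pairing from the question: (0,1), (2,3), (4,5).
# Each pair can face (bottom, top) in either order.
orientations = []
for c in cubes:
    opts = set()
    for a, b in ((c[0], c[1]), (c[2], c[3]), (c[4], c[5])):
        opts.add((a, b))
        opts.add((b, a))
    orientations.append(opts)

N = len(cubes)

@lru_cache(maxsize=None)
def tallest(top, used):
    """Max number of cubes still stackable when the current tower
    shows `top` (None for an empty tower) and `used` is a bitmask
    of consumed cubes. At most M * 2**N distinct states."""
    best = 0
    for i in range(N):
        if not used & (1 << i):
            for bottom, new_top in orientations[i]:
                if top is None or bottom == top:
                    best = max(best, 1 + tallest(new_top, used | (1 << i)))
    return best

print(tallest(None, 0))  # 5: all five example cubes can be stacked
```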
What I'm looking for:
I'm looking for something that would help me solve this efficiently for small M. I know that there is tile stacking problem (which uses Dynamic Programming) and tower of cubes (which uses DAG longest path) but I don't see the applicability of these solutions to my problem.
You won't find a polynomial-time solution: if you did, we'd be able to solve the decision variant of the longest path problem (which is NP-complete) in polynomial time. The reduction is as follows: for every edge of an undirected graph G, create a cube with opposing faces (u, v), where u and v are unique identifiers for the endpoints of the edge. For the remaining 4 faces, assign globally unique identifiers. Solve for the tallest cube tower; this tower's height is the length of the longest path of G, so return whether that length equals the queried value (yes/no).
However, you could still solve it in something like O(M^3 * (N/2)! * log(N)) time (I think that bound is a bit loose, but it's close). Use divide and conquer with memoization. Find all longest towers using cubes [0, N) beginning with a value B in range [0, M) and ending with a value E in range [0, M), for all possible B and E. To compute this, recurse, partitioning the cubes evenly in every possible way. Keep recursing until you hit the bottom (just one cube). Then begin merging (combine cube stacks that end in X with those beginning with X, for all X in [0, M)). Once that's all done, at the topmost level just take the max of all the tower heights.

Algorithms for distributing natural numbers in to equal piles

I'm looking for an algorithm that takes a set of natural numbers, for example:
S = {1, 3, 4, 2, 9, 34, 432, 43}
and divides them into piles that are as equal as possible. The number of piles, n, is predefined.
The goal is to minimize the sum of the differences between each pile and the lowest pile.
Here comes an example.
Let's say you have:
S = { 1, 2, 2, 3, 1, 2, 3 }
n = 3
Then a solution could be
N1 = { 1, 2 }
N2 = { 2, 3 }
N3 = { 1, 2, 3 }
The sum of these piles would be 3, 5 and 6. The error would be: (5 - 3) + (6 - 3) = 5.
The algorithm needs to find the solution with the lowest error.
Any help is appreciated. Please comment if something is unclear.
I would argue that there is no efficient way to solve this problem, because it is NP-hard.
Proof:
Let's denote the problem you proposed as P*.
We can reduce the partition problem (known to be NP-hard) to P* as follows.
Given an arbitrary partition instance P1, ask the black box that solves P* to solve P1 with n = 2 (i.e., divide the set into 2 piles, minimizing the difference).
If the difference returned by the black box is zero, there is a solution for P1.
If the difference returned by the black box is non-zero, there isn't a solution for P1.
Therefore, P* is NP-hard.
This sounds like a variation of the bin packing problem (https://en.m.wikipedia.org/wiki/Bin_packing_problem). However, the size of the bins is not given, so the problem is at least as hard as bin packing and therefore NP-hard.
For an approximate solution you could, for example, calculate the average pile size and use an adaptation of first-fit or best-fit that allows a small amount of overpacking.
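As a sketch of such a heuristic (this is the "largest number into the currently lightest pile" rule, a close cousin of best-fit rather than exactly what is described above, and not an exact solver):

```python
import heapq

def greedy_piles(numbers, n):
    """Greedy heuristic: place each number, largest first,
    into the pile with the smallest current sum."""
    heap = [(0, i) for i in range(n)]  # (current sum, pile index)
    heapq.heapify(heap)
    piles = [[] for _ in range(n)]
    for x in sorted(numbers, reverse=True):
        total, i = heapq.heappop(heap)
        piles[i].append(x)
        heapq.heappush(heap, (total + x, i))
    return piles

# The example from the question: S = {1, 2, 2, 3, 1, 2, 3}, n = 3.
piles = greedy_piles([1, 2, 2, 3, 1, 2, 3], 3)
print(sorted(sum(p) for p in piles))  # [4, 5, 5] -> error (5-4)+(5-4) = 2
```

On this instance the heuristic happens to be optimal (the total is 14, so no pile split can have a minimum above 4), but in general it gives no such guarantee.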

distance between nodes in floyd warshall

This Wikipedia page explains the Floyd-Warshall algorithm for finding the shortest paths between nodes in a graph. The page uses the graph on the left of the image as the starting graph (prior to the first iteration, when k = 0) and then shows the remaining iterations (k = 1, etc.), but it doesn't explain the significance of the numbers on the edges or how those numbers are calculated. For example, in the starting graph when k = 0, why is there a -2 on the edge between 1 and 3, and why is there a 3 on the edge between 2 and 3? How are those calculated?
Furthermore, when k = 2, the wikipedia page says,
The path [4,2,3] is not considered, because [2,1,3] is the shortest
path encountered so far from 2 to 3.
Why is [2,1,3] shorter than [4,2,3]?
The numbers on the edges are just weights. It's a part of the input. The algorithm doesn't compute them.
[2, 1, 3] is not shorter than [4, 2, 3]. It's shorter than [2, 3], though. That's the only thing that matters.
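To make this concrete, here is Floyd-Warshall run in Python on an edge list assumed to match the Wikipedia example (the weights, including the -2 between 1 and 3, are simply given as input; the algorithm never computes them):

```python
INF = float("inf")

# Directed, weighted input graph (assumed to match the Wikipedia figure).
edges = [(1, 3, -2), (2, 1, 4), (2, 3, 3), (3, 4, 2), (4, 2, -1)]
n = 4

# dist[i][j] starts as the input edge weight (INF if no edge, 0 if i == j).
dist = [[INF] * (n + 1) for _ in range(n + 1)]
for v in range(1, n + 1):
    dist[v][v] = 0
for u, v, w in edges:
    dist[u][v] = w

# Relax through every intermediate vertex k.
for k in range(1, n + 1):
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]

print(dist[2][3])  # 2: the path [2, 1, 3] (4 + -2) beats the direct edge (3)
```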

Finding the Reachability Count for all vertices of a DAG

I am trying to find a fast algorithm with modest space requirements to solve the following problem.
For each vertex of a DAG find the sum of its in-degree and out-degree in the DAG's transitive closure.
Given this DAG:
I expect the following result:
Vertex | Reachability count | Reachable vertices in closure
-------+--------------------+------------------------------
7      | 5                  | 11, 8, 2, 9, 10
5      | 4                  | 11, 2, 9, 10
3      | 3                  | 8, 9, 10
11     | 5                  | 7, 5, 2, 9, 10
8      | 3                  | 7, 3, 9
2      | 3                  | 7, 5, 11
9      | 5                  | 7, 5, 11, 8, 3
10     | 4                  | 7, 5, 11, 3
It seems to me that this should be possible without actually constructing the transitive closure. I haven't been able to find anything on the net that exactly describes this problem. I've got some ideas about how to do this, but I wanted to see what the SO crowd could come up with.
For an exact answer, I think it's going to be hard to beat KennyTM's algorithm. If you're willing to settle for an approximation, then the tank-counting method (http://www.guardian.co.uk/world/2006/jul/20/secondworldwar.tvandradio) may help.
Assign each vertex a random number in the range [0, 1). Use a linear-time dynamic program like polygenelubricants's to compute, for each vertex v, the minimum number minreach(v) over all vertices reachable from v. Then estimate the number of vertices reachable from v as 1/minreach(v) - 1. For better accuracy, repeat several times and take a median of means at each vertex.
For each node, use BFS or DFS to find the out-reachability.
Do it again for the reversed direction to find the in-reachability.
Time complexity: O(MN + N^2), space complexity: O(M + N).
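A Python sketch of that approach, using an edge list assumed to match the question's table (the original figure is not included, so this DAG is reconstructed from the expected results):

```python
from collections import deque, defaultdict

# DAG consistent with the table in the question.
edges = [(7, 11), (7, 8), (5, 11), (3, 8), (3, 10),
         (11, 2), (11, 9), (11, 10), (8, 9)]

def reach_counts(edges):
    """BFS from every vertex; returns |reachable(v)|, excluding v itself."""
    succ = defaultdict(list)
    vertices = set()
    for u, v in edges:
        succ[u].append(v)
        vertices.update((u, v))
    counts = {}
    for src in vertices:
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        counts[src] = len(seen) - 1
    return counts

out = reach_counts(edges)                       # out-reachability
rev = reach_counts([(v, u) for u, v in edges])  # in-reachability
total = {v: out[v] + rev[v] for v in out}
print(total)  # matches the table, e.g. total[7] == 5, total[9] == 5
```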
I have constructed a viable solution to this question. I base my solution on a modification of the topological sorting algorithm. The algorithm below calculates only the in-degree in the transitive closure. The out-degree can be computed in the same fashion with edges reversed and the two counts for each vertex summed to determine the final "reachability count".
for each vertex V
    inCount[V] = inDegree(V)   // inDegree() is O(1)
    if inCount[V] == 0
        pending.addTail(V)
while pending not empty
    process(pending.removeHead())

function process(V)
    for each edge (V, V2)
        predecessors[V2].add(predecessors[V])   // probably O(|predecessors[V]|)
        predecessors[V2].add(V)
        inCount[V2] -= 1
        if inCount[V2] == 0
            pending.add(V2)
    count[V] = sizeof(predecessors[V])   // store final answer for V
    predecessors[V] = EMPTY              // save some memory
Assuming that the set operations are O(1), this algorithm runs in O(|V| + |E|). It is more likely, however, that the set union operation predecessors[V2].add(predecessors[V]) makes it somewhat worse. The additional steps required by the set unions depend on the shape of the DAG. I believe the worst case is O(|V|^2 + |E|). In my tests this algorithm has shown better performance than any other I have tried so far.
Furthermore, by disposing of predecessor sets for fully processed vertices, this algorithm will typically use less memory than most alternatives. It is true, however, that the worst case memory consumption of the above algorithm matches that of constructing the transitive closure, but that will not be true for most DAGs.
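Here is a runnable Python version of the algorithm above, again on an edge list reconstructed from the question's table (the figure itself is missing):

```python
from collections import deque, defaultdict

# DAG consistent with the table in the question.
edges = [(7, 11), (7, 8), (5, 11), (3, 8), (3, 10),
         (11, 2), (11, 9), (11, 10), (8, 9)]

def closure_in_counts(edges):
    """In-degree of every vertex in the transitive closure, computed by
    propagating predecessor sets in topological order."""
    succ = defaultdict(list)
    in_count = defaultdict(int)
    vertices = set()
    for u, v in edges:
        succ[u].append(v)
        in_count[v] += 1
        vertices.update((u, v))
    preds = {v: set() for v in vertices}
    pending = deque(v for v in vertices if in_count[v] == 0)
    counts = {}
    while pending:
        v = pending.popleft()
        counts[v] = len(preds[v])  # preds[v] is complete once v is dequeued
        for w in succ[v]:
            preds[w] |= preds[v]
            preds[w].add(v)
            in_count[w] -= 1
            if in_count[w] == 0:
                pending.append(w)
        preds[v] = set()  # dispose of the set, as in the pseudocode
    return counts

inc = closure_in_counts(edges)                       # in-degree in closure
out = closure_in_counts([(v, u) for u, v in edges])  # out-degree in closure
total = {v: inc[v] + out[v] for v in inc}
print(total)  # matches the table in the question
```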
OMG IT'S WRONG! SORRY! (It counts each reachable vertex once per path leading to it, so vertices reachable along several paths are overcounted.)
I'll leave this up until a good alternative is available. CW-ed, so feel free to discuss and expand on this if possible.
Use dynamic programming.
for each vertex V
    count[V] = UNKNOWN
for each vertex V
    getCount(V)

function getCount(V)
    if count[V] == UNKNOWN
        count[V] = 0
        for each edge (V, V2)
            count[V] += getCount(V2) + 1
    return count[V]
This is O(|V|+|E|) with adjacency list. It counts only the out-degree in the transitive closure. To count the in-degrees, call getCount with edges reversed. To get the sum, add up the counts from both calls.
To see why this is O(|V|+|E|), consider this: each vertex V will be visited exactly 1 + in-degree(V) times: once directly on V, and once for every edge (*, V). On subsequent visits, getCount(V) simply returns the memoized count[V] in O(1).
Another way to look at it is to count how many times each edge will be followed along: exactly once.
I assume that you have a list of all vertices, and that each vertex has an id and a list of the vertices directly reachable from it.
You can then add another field (however you represent it) that holds the vertices you can also reach indirectly. I would fill it in with a recursive depth-first search, memoizing the results in the field of each reached node. As a data structure for this, you would perhaps use some sort of tree that allows efficient removal of duplicates.
The in-reachability can be computed separately by adding the inverse links, but it can also be done in the same pass as the out-reachability, by accumulating the currently out-reaching nodes and adding them to the corresponding fields of the reached nodes.
