What is the largest chromatic number of a binary tree? - binary-tree

This is a Discrete Math/Combinatorics Question from my homework, but I don't really understand the question.
Find largest chromatic number of a full binary tree given the following depths:
(Check all that apply)
2, 3, 7, 12, 200
I understand that the chromatic number refers to the minimum color that you can color a graph or tree with the adjacent nodes or vertices being different colors.
So knowing this fact, I'm sure that the chromatic number for all full binary trees should be 2 since you can use two different color nodes to complete the tree. But they want me to find the largest chromatic number, which confuses me.
Am I missing something?

All trees are bipartite graphs and therefore 2-colorable. One way to see this is that a graph is bipartite iff it has no odd cycles, and since trees have no cycles at all, they must be bipartite. Therefore, two colors suffice regardless of the tree size.
Hope this helps!

Related

How to build a Minimum Spanning Tree given a list of 200 000 nodes?

Problem
I have a list of approximatly 200000 nodes that represent lat/lon position in a city and I have to compute the Minimum Spanning Tree. I know that I need to use Prim algorithm but first of all I need a connected graph. (We can assume that those nodes are in a Euclidian plan)
To build this connected graph I thought firstly to compute the complete graph but (205000*(205000-1)/2 is around 19 billions edges and I can't handle that.
Options
Then I came across to Delaunay triangulation: with the fact that if I build this "Delauney graph", it contains a sub graph that is the Minimum Spanning Tree according and I have a total of around 600000 edges according to Wikipedia [..]it has at most 3n-6 edges. So it may be a good starting point for a Minimum Spanning Tree algorithm.
Another options is to build an approximately connected graph but with that I will maybe miss important edges that will influence my Minimum Spanning Tree.
My question
Is Delaunay a reliable solution in this case? If so, is there any other reliable solution than delaunay triangulation to this problem ?
Further information: this problem has to be solved in C.
The Delaunay triangulation of a point set is always a superset of the EMST of these points. So it is absolutely "reliable"*. And recommended, as it has a size linear in the number of points and can be efficiently built.
*When there are cocircular point quadruples, neither the triangulation nor the EMST are uniquely defined, but this is usually harmless.
There's a big question here of what libraries you have access to and how much you trust yourself as a coder. (I'm assuming the fact that you're new on SO should not be taken as a measure of your overall experience as a programmer - if it is, well, RIP.)
If we assume you don't have access to Delaunay and can't implement it yourself, minimum spanning trees algorithms that pre-suppose a graph aren't necessarily off limits to you. You can have the complete graph conceptually but not actually. Kruskal's algorithm, for instance, assumes you have a sorted list of all edges in your graph; most of your edges will not be near the minimum, and you do not have to compare all n^2 to find the minimum.
You can find minimum edges quickly by estimations that give you a reduced set, then refinement. For instance, if you divide your graph into a 100*100 grid, for any point p in the graph, points in the same grid square as p are guaranteed to be closer than points three or more squares away. This gives a much smaller set of points you have to compare to safely know you've found the closest.
It still won't be easy, but that might be easier than Delaunay.

Maximal number of vertex pairs in undirected not weighted graph

Given undirected not weighted graph with any type of connectivity, i.e. it can contain from 1 to several components with or without single nodes, each node can have 0 to many connections, cycles are allowed (but no loops from node to itself).
I need to find the maximal amount of vertex pairs assuming that each vertex can be used only once, ex. if graph has nodes 1,2,3 and node 3 is connected to nodes 1 and 2, the answer is one (1-3 or 2-3).
I am thinking about the following approach:
Remove all single nodes.
Find the edge connected a node with minimal number of edges to node with maximal number of edges (if there are several - take any of them), count and remove this pair of nodes from graph.
Repeat step 2 while graph has connected nodes.
My questions are:
Does it provide maximal number of pairs for any case? I am
worrying about some extremes, like cycles connected with some
single or several paths, etc.
Is there any faster and correct algorithm?
I can use java or python, but pseudocode or just algo description is perfectly fine.
Your approach is not guaranteed to provide the maximum number of vertex pairs even in the case of a cycle-free graph. For example, in the following graph your approach is going to select the edge (B,C). After that unfortunate choice, there are no more vertex pairs to choose from, and therefore you'll end up with a solution of size 1. Clearly, the optimal solution contains two vertex pairs, and hence your approach is not optimal.
The problem you're trying to solve is the Maximum Matching Problem (not to be confused with the Maximal Matching Problem which is trivial to solve):
Find the largest subset of edges S such that no vertex is incident to more than one edge in S.
The Blossom Algorithm solves this problem in O(EV^2).
The way the algorithm works is not straightforward and it introduces nontrivial notions (like a contracted matching, forest expansions and blossoms) to establish the optimal matching. If you just want to use the algorithm without fully understanding its intricacies you can find ready-to-use implementations of it online (such as this Python implementation).

Minimum vertex cover

I am trying to get a vertex cover for an "almost" tree with 50,000 vertices. The graph is generated as a tree with random edges added in making it "almost" a tree.
I used the approximation method where you marry two vertices, add them to the cover and remove them from the graph, then move on to another set of vertices. After that I tried to reduce the number of vertices by removing the vertices that have all of their neighbors inside the vertex cover.
My question is how would I make the vertex cover even smaller? I'm trying to go as low as I can.
Here's an idea, but I have no idea if it is an improvement in practice:
From https://en.wikipedia.org/wiki/Biconnected_component "Any connected graph decomposes into a tree of biconnected components called the block-cut tree of the graph." Furthermore, you can compute such a decomposition in linear time.
I suggest that when you marry and remove two vertices you do this only for two vertices within the same biconnected component. When you have run out of vertices to merge you will have a set of trees not connected with each other. The vertex cover problem on trees is tractable via dynamic programming: for each node compute the cost of the best answer if that node is added to the cover and if that node is not added to the cover. You can compute the answers for a node given the best answers for its children.
Another way - for all I know better - would be to compute the minimum spanning tree of the graph and to use dynamic programming to compute the best vertex cover for that tree, neglecting the links outside the tree, remove the covered links from the graph, and then continue by marrying vertices as before.
I think I prefer the minimum spanning tree one. In producing the minimum spanning tree you are deleting a small number of links. A tree with N nodes had N-1 links, so even if you don't get back the original tree you get back one with as many links as it. A vertex cover for the complete graph is also a vertex cover for the minimum spanning tree so if the correct answer for the full graph has V vertices there is an answer for the minimum spanning tree with at most V vertices. If there were k random edges added to the tree there are k edges (not necessarily the same) that need to be added to turn the minimum spanning tree into the full graph. You can certainly make sure these new edges are covered with at most k vertices. So if the optimum answer has V vertices you will obtain an answer with at most V+k vertices.
Here's an attempt at an exact answer which is tractable when only a small number of links are added, or when they don't change the inter-node distances very much.
Find a minimum spanning tree, and divide edges into "tree edges" and "added edges", where the tree edges form a minimum spanning tree, and the added edges were not chosen for this. They may not be the edges actually added during construction but that doesn't matter. All trees on N nodes have N-1 edges so we have the same number of added edges as were used during creation, even if not the same edges.
Now pretend you can peek at the answer in the back of the book just enough to see, for one vertex from each added edge, whether that vertex was part of the best vertex cover. If it was, you can remove that vertex and its links from the problem. If not, the other vertex must be so you can remove it and its links from the problem.
You now have to find a minimum vertex cover for a tree or a number of disconnected trees, and we know how to do this - see my other answer for a bit more handwaving.
If you can't peek at the back of the book for an answer, and there are k added edges, try all 2^k possible answers that might have been in the back of the book and find the best. If you are lucky then added link A is in a different subtree from added link B. In that case you can confine the two calculations needed for the two possibilities for added link A (or B) to the dynamic programming calculations for the relevant subtree so you have only doubled the work instead of quadrupled it. In general, if your k added edges are in k different subtrees that don't interfere with each other, the cost is multiplied by 2 instead of 2^k.
Minimum vertex cover is an NP complete algorithm, which means that you can not solve it in a reasonable time even for something like 100 vertices (not to mention 50k).
For a tree there is a polynomial time greedy algorithm which is based on DFS, but the fact that you have "random edges added" screws everything up and makes this algorithm useless.
Wikipedia has an article about approximation algorithm, claims that it reaches factor 2 and claims that no better algorithm is know, which makes it quit unlikely that you will find one.

Rearranging a graph so that certain nodes are not adjacent?

EDIT: Precisely, I am trying to find two disjoint independent sets of known size in a graph shaped like a triangular grid, which may have holes and has a variable perimeter shape.
I'm not very well versed in graph theory, so I'm not sure if there exists an efficient solution for this problem. Consider the following graphs:
The colors of any two nodes can be swapped. The goal is to ensure that no two red nodes are adjacent, and no two green nodes are adjacent. The edges marked with exclamation points are invalid. Basically, I need to write two algorithms:
Determine that the nodes in a given graph can be arranged so that red and green nodes are not adjacent to nodes of the same color.
Actually rearrange the nodes.
I'm a little lost on how to implement this. It's not too difficult to separate the nodes of one color, but repeating the process for the second color may mess up the first color. Without a way to determine whether the graph can actually be arranged properly, this process could loop forever.
Is there some kind of algorithm that I can use/write for this? I'm mainly interested in the first image's graph (a triangular grid), but a generic algorithm would work as well.
First, let's note that the problem is a variant of graph coloring.
Now, if you only dealing with 2 colors (red,green) - coloring a graph with 2 colors is fairly easy, and is basically done by finding out if the graph is bipartite, and coloring each "side" of the graph in one color. Finding if a graph is bipartite is fairly simple.
However, if you want more than two colors, the problem becomes NP-Complete, and is actually a variant of the Graph Coloring Problem.
Graph Coloring Problem:
Given a graph G=(V,E) and a number k determine if there is a
function c:V->{1,2.,,,.k} such that c(v) = v(u) -> (v,u) is not an
edge.
Informally, you can color the graph in k colors, and you need to determine if there is some coloring such that you never color 2 nodes that share an edge with the same color.
Note that while it seems your problem is slightly easier, since you already know what is the number of nodes in each color, it doesn't really make a difference.
Assume you have a polynomial time algorithm A that solves your problem.
Now, given an instance (G,k) of graph coloring - there are only O(n^3) possibilities to #color1,#color2,#color3 - so by examining each of these and invoking A on it, you can find a polynomial time solution to Graph-Coloring. This will mean P=NP, which is most likely (according to most CS researchers) not the case.
tl;dr:
For 2 colors: find out if the graph is bipartite - and give one color to each side of the graph.
For 3 or more colors: There is no known efficient solution, and the general belief is one does not exist.
I thought this problem would be easier for planar graph, but unfortunately it's not the case. Best match for this problem I was able to find is minimum sum coloring and largest bipartite subgraph.
For largest bipartite subgraph, assume that number of reds + number of greens exactly match the size of largest bipartite subgraph. Then this problem is equivalent to your. Paper claims that it's still NP-hard even for planar graphs.
For minimum sum coloring, assume that red color has weight 1, green color has color 2, and we have infinitely many blue* colors with some weight of >graph size. Then if answer is exactly minimal sum coloring, there is no polynomial algorithm to find it (although paper referes to such algorithm for chordal graphs).
Anyway, it seems that the closer your red+green count to the 'optimal' in some sense subgraph, the more difficult problem is.
If you can afford inexact solution, or relaxed solution then you only spearate, say, reds, you have an option. As I said in comment, approximate solution of maximum independent set problem for planar graph. Then color this set into red and blue colors, if it is large enough.
If you know that red+green is much less than total number of vertices, another approximation can work. Look at introduction chapter of this article. It claims that:
For graphs which are promised to have small chromatic number, a better
guarantee is known: given a k-colorable graph, an algorithm due to
Karger et al. [12] uses semidefinite programming (SDP) to find an
independent set of size about Ω(n/∆^(1−2/k)).
Your graph is for sure 4-colorable, so you can count on large enough independent set. The same article states that greedy solution already can find large enough independent set.

How can the shortest path between two nodes in a balanced binary tree be affected by path 'weight'?

I am following an online introductory algorithms course with Udacity.
In the final assessment there is a question as follows:
In the shortest-path oracle described in Andrew Goldberg's interview,
each node has a label, which is a list of some other nodes in the
network and their distance to these nodes. These lists have the
property that:
(1) for any pair of nodes (x,y) in the network, their lists will have
at least one node z in common
(2) the shortest path from x to y will go through z. Given a graph G
that is a balanced binary tree, preprocess the graph to create such
labels for each node. Note that the size of the list in each label
should not be larger than log n for a graph of size n.
The full question can be found here.
Given the constraint of a balanced binary tree and the hint that the size should not be larger than log n, intuitively it seems that the label for a particular node would consist of all its parents (and optionally itself, if it isn't a leaf).
However some additional instructor notes in the question adds:
Write your solution to work on weighted graphs. Note that the test
given, all the edges have a weight of one - which isn't particularly
interesting.
So my question is:
How can the shortest path between two nodes in a binary tree be affected by whether the paths have weights or not?
Surely in a binary tree, the shortest path between two nodes is the unique simple path, and is unaffected by any weighting?
(unless weights can be negative and the path doesn't have to be simple in which case there is no shortest path?)
My basic solution works with the simple test provided in the question, but fails to pass the automatic grader which gives no feedback.
I'm obviously misunderstanding something, but what...
Ok, so I think my initial reaction and the obvious answer is correct:
Positive weights cannot affect the shortest path between two nodes in a binary tree.
On the other hand, weights obviously do affect the shortest 'distance' between two nodes in a binary tree as compared to simply calculating 'distance' between nodes as the number of hops.
This is what the udacity instruction was getting at.
It seems that this instruction to work with weighted binary trees was simply to enable correct automatic grading of the code which relied on using the labels to calculate the exact shortest 'distance' (which is affected by weight) as opposed to the shortest path (list of nodes) which is not.
Once I modified my algorithm to take this into account and output the correct distance, it passed the grader.

Resources