Graph representation with a value in each vertex - algorithm

I know how to represent a graph with an adjacency list and I also know the matrix representation (reference: The Algorithm Design Manual).
The adjacency list representation is simply:
struct edge {
    int y;                  // index of the target vertex
    int weight;             // edge weight
    struct edge *next;      // next edge in this vertex's adjacency list
};
struct graph {
    int n;                  // number of vertices
    struct edge *edges[N];  // one adjacency list per vertex
};
But now I want to store a value in each vertex, while the edges carry no values:
struct vertex {
    int value; // to be used for sums later
    struct vertex *parent;
    struct vertex *child;
};
struct edge {
    struct vertex *start;
    struct vertex *end;
};
struct graph {
    int n;
    struct vertex *v[N]; // array of vertices
    // How do I link the vertices and the edges?
    // struct edge
};
My question is how do I link the vertices with the edges?

struct graph {
    int n_nodes;
    struct node {
        int value;              // value stored in the vertex
        int n_adjacents;
        int *adjacent_indices;  // indices into the nodes array, i.e. the edges
    } **nodes;
} graph;
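To make the linkage concrete, here is a small self-contained C++ sketch of the same idea (the names Node, Graph, add_edge and neighbour_sum are mine, not from the answer): the value lives in each vertex, and an edge is nothing more than the index of a neighbouring vertex stored in that vertex's adjacency list.
#include <vector>
struct Node {
    int value;                  // per-vertex value, to be used for sums later
    std::vector<int> adjacent;  // indices of adjacent nodes
};
struct Graph {
    std::vector<Node> nodes;
    // undirected, unweighted edge: each endpoint records the other's index
    void add_edge(int a, int b) {
        nodes[a].adjacent.push_back(b);
        nodes[b].adjacent.push_back(a);
    }
};
// example use: sum the values of all neighbours of vertex u
int neighbour_sum(const Graph& g, int u) {
    int sum = 0;
    for (int v : g.nodes[u].adjacent) sum += g.nodes[v].value;
    return sum;
}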

Algorithm that finds a simple cycle in the graph and prints it

Let G=(V,E) be a simple undirected graph. Suggest an algorithm that finds some simple cycle in the graph and prints it (the sequence of nodes composing it). If there is no such cycle, the algorithm will not print anything.
Algorithm:
Initialize an array of size n, and a parent variable for each vertex.
Start DFS from an arbitrary vertex, and for each visited vertex, mark "1" in the array and record its parent.
If during the DFS the next vertex is an already marked vertex that is not the current vertex's parent, there is a cycle in the graph; print it by walking backwards through the parent variables.
Is the algorithm correct? Or do I need to change things?
Thanks!
From graph theory we know that, for a connected graph:
if the number of vertices is greater than the number of edges, there are no cycles (closed contours);
if the number of vertices equals the number of edges, the graph has exactly one cycle;
if the number of vertices is less than the number of edges, the graph has more than one cycle.
(The connectivity qualifier matters: two disjoint triangles have 6 vertices, 6 edges, and two cycles.)
I offer a depth-first-search algorithm that finds the simple cycles in the graph and prints them:
#include <iostream>
#include <vector>
#include <set>
#include <algorithm>
using namespace std;
const int maximumSize=40;
vector<vector<int>> visited(maximumSize, vector<int>(maximumSize, 0));
vector<int> graph[maximumSize], closedContour, temporary;
int vertices, edges;
set<vector<int>> contours;
void showContentSetVector(set<vector<int>> input)
{
for(auto iterator=input.begin(); iterator!=input.end(); ++iterator)
{
for(auto item : *iterator)
{
cout<<item<<", ";
}
cout<<endl;
}
return;
}
bool compare(int i,int j)
{
return (i<j);
}
void createGraph()
{
cin>>vertices>>edges;
int vertex0, vertex1;
for(int i=1; i<=edges; ++i)
{
cin>>vertex0>>vertex1;
graph[vertex0].push_back(vertex1);
graph[vertex1].push_back(vertex0);
}
return;
}
void depthFirstSearch(int initial, int current, int previous)
{
if(visited[initial][current]==1)
{
for(int i=0; i<temporary.size(); ++i)
{
if(temporary[i]==current)
{
for(int j=i; j<temporary.size(); ++j)
{
closedContour.push_back(temporary[j]);
}
}
}
sort(closedContour.begin(), closedContour.end(), compare);
contours.insert(closedContour);
closedContour.clear();
return;
}
visited[initial][current]=1;
temporary.push_back(current);
for(int next : graph[current])
{
if(next==previous)
{
continue;
}
depthFirstSearch(initial, next, current);
}
temporary.pop_back();
return;
}
void solve()
{
createGraph();
for(int vertex=1; vertex<=vertices; ++vertex)
{
temporary.clear();
depthFirstSearch(vertex, vertex, -1);
}
cout<<"contours <- ";
showContentSetVector(contours);
return;
}
int main()
{
solve();
return 0;
}
Here is the result:
contours <-
1, 2, 3, 4,
6, 7, 8,

How to implement range search in KD-Tree

I have built a d-dimensional KD-tree. I want to do range search on this tree. Wikipedia mentions range search in KD-trees, but doesn't talk about the implementation/algorithm in any way. Can someone please help me with this? If not for arbitrary d, any help for at least d = 2 and d = 3 would be great. Thanks!
There are multiple variants of kd-tree. The one I used had the following specs:
Each internal node has at most two children.
Each leaf node holds at most maxCapacity points.
No internal node stores any points.
Side note: there are also versions where each node (whether internal or leaf) stores exactly one point. The algorithm below can be tweaked for those too; it is mainly buildTree where the key difference lies.
I wrote an algorithm for this some two years back, thanks to the resource pointed to by #9mat.
Suppose the task is to count the points that lie in a given hyper-rectangle in d dimensions. The task could instead be to list all such points, or all points that lie in the given range and satisfy some further criteria, but that is a straightforward change to my code.
Define a base node class as:
template <typename T> class kdNode{
public: kdNode(){}
    virtual ~kdNode() = default; // needed if nodes are ever deleted through a kdNode<T>*
    virtual long rangeQuery(const T* q_min, const T* q_max) const{ return 0; }
};
Then, an internal node (non-leaf node) can look like this:
template <typename T> class internalNode:public kdNode<T>{
    const kdNode<T> *left = nullptr, *right = nullptr; // left and right sub trees
    int axis; // the axis on which the split of points is done
    T value;  // the value at which the points are split
public: internalNode(){}
    void buildTree(...){
        // builds the tree recursively (also sets left, right, axis and value)
    }
    // returns the number of points in this sub tree that lie inside the hyper rectangle formed by q_min and q_max
    long rangeQuery(const T* q_min, const T* q_max) const override {
        // num of points that satisfy the range query conditions
        long rangeCount = 0;
        // the left sub tree may contain matching points
        if(q_min[axis] <= value) {
            rangeCount += left->rangeQuery(q_min, q_max);
        }
        // the right sub tree may contain matching points
        if(q_max[axis] >= value) {
            rangeCount += right->rangeQuery(q_min, q_max);
        }
        return rangeCount;
    }
};
Finally, the leaf node would look like:
// d (the dimensionality) and maxCapacity are assumed to be compile-time constants defined elsewhere;
// std::array needs #include <array>
template <typename T> class leaf:public kdNode<T>{
    // maxCapacity is a hyper-param, the max num of points you allow a node to hold
    std::array<T, d> points[maxCapacity];
    int keyCount = 0; // the actual num of points in this leaf (keyCount <= maxCapacity)
public: leaf(){}
public: void addPoint(const T* p){
        // add a point p to the leaf node
    }
    // check if points[index] lies inside the hyper rectangle formed by q_min and q_max
    inline bool containsPoint(const int index, const T* q_min, const T* q_max) const{
        for (int i=0; i<d; i++) {
            if (points[index][i] > q_max[i] || points[index][i] < q_min[i]) {
                return false;
            }
        }
        return true;
    }
    // returns the number of points in this leaf node that lie inside the hyper rectangle formed by q_min and q_max
    long rangeQuery(const T* q_min, const T* q_max) const override {
        // num of points that satisfy the range query conditions
        long rangeCount = 0;
        for(int i=0; i < this->keyCount; i++) {
            if(containsPoint(i, q_min, q_max)) {
                rangeCount++;
            }
        }
        return rangeCount;
    }
};
In the code for the range query inside the leaf node, it is also possible to do a binary search instead of a pure linear search. Since the points are sorted along the split axis, you can binary-search for the bounds l and r using q_min and q_max, and then scan linearly from l to r instead of from 0 to keyCount-1. (In the worst case this won't help, but in practice, especially if maxCapacity is fairly large, it can.)
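As an illustration (my own sketch, not part of the original answer), the bounds l and r could be found with std::lower_bound/std::upper_bound, assuming the leaf knows its split axis and keeps points[0..keyCount) sorted along that axis:
#include <algorithm>
#include <array>
// Sketch only: `points` is sorted by coordinate `axis`; names are illustrative.
template <typename T, int d>
long rangeQuerySorted(const std::array<T, d>* points, int keyCount, int axis,
                      const T* q_min, const T* q_max) {
    auto lessOnAxis    = [axis](const std::array<T, d>& p, T v) { return p[axis] < v; };
    auto greaterOnAxis = [axis](T v, const std::array<T, d>& p) { return v < p[axis]; };
    // first point whose coordinate on `axis` is >= q_min[axis]
    const std::array<T, d>* l = std::lower_bound(points, points + keyCount, q_min[axis], lessOnAxis);
    // one past the last point whose coordinate on `axis` is <= q_max[axis]
    const std::array<T, d>* r = std::upper_bound(points, points + keyCount, q_max[axis], greaterOnAxis);
    long rangeCount = 0;
    for (const std::array<T, d>* p = l; p < r; ++p) {   // linear scan only between l and r
        bool inside = true;
        for (int i = 0; i < d; i++)                     // still check the remaining dimensions
            if ((*p)[i] > q_max[i] || (*p)[i] < q_min[i]) { inside = false; break; }
        if (inside) rangeCount++;
    }
    return rangeCount;
}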
This is my solution for a KD-tree where each node stores a point (so not just the leaves). (Adapting it for the case where points are stored only in the leaves is really easy.)
I leave some of the optimizations out and will explain them at the end, to reduce the complexity of the solution.
The get_range function has varargs at the end, and can be called like
x1, y1, x2, y2 or
x1, y1, z1, x2, y2, z2, etc., where first the low values of the range are given and then the high values.
(You can use as many dimensions as you like.)
static public <T> void get_range(K_D_Tree<T> tree, List<T> result, float... range) {
if (tree.root == null) return;
float[] node_region = new float[tree.DIMENSIONS * 2];
for (int i = 0; i < tree.DIMENSIONS; i++) {
node_region[i] = -Float.MAX_VALUE;
node_region[i+tree.DIMENSIONS] = Float.MAX_VALUE;
}
_get_range(tree, result, tree.root, node_region, 0, range);
}
node_region represents the region covered by the node; we start as large as possible, because for all we know this could be the region we are dealing with.
Here is the recursive _get_range implementation:
static public <T> void _get_range(K_D_Tree<T> tree, List<T> result, K_D_Tree_Node<T> node, float[] node_region, int dimension, float[] target_region) {
if (dimension == tree.DIMENSIONS) dimension = 0;
if (_contains_region(tree, node_region, target_region)) {
_add_whole_branch(node, result);
}
else {
float value = _value(tree, dimension, node);
if (node.left != null) {
float[] node_region_left = new float[tree.DIMENSIONS*2];
System.arraycopy(node_region, 0, node_region_left, 0, node_region.length);
node_region_left[dimension + tree.DIMENSIONS] = value;
if (_intersects_region(tree, node_region_left, target_region)){
_get_range(tree, result, node.left, node_region_left, dimension+1, target_region);
}
}
if (node.right != null) {
float[] node_region_right = new float[tree.DIMENSIONS*2];
System.arraycopy(node_region, 0, node_region_right, 0, node_region.length);
node_region_right[dimension] = value;
if (_intersects_region(tree, node_region_right, target_region)){
_get_range(tree, result, node.right, node_region_right, dimension+1, target_region);
}
}
if (_region_contains_node(tree, target_region, node)) {
result.add(node.point);
}
}
}
One important thing that the other answer does not provide is this part:
if (_contains_region(tree, node_region, target_region)) {
_add_whole_branch(node, result);
}
With a range search in a KD-tree there are three possibilities for a node's region relative to the query region:
it is fully outside,
it intersects, or
it is fully contained.
Once you know a region is fully contained, you can add the whole branch without doing any per-dimension checks.
To make this clearer, here is _add_whole_branch:
static public <T> void _add_whole_branch(K_D_Tree_Node<T> node, List<T> result) {
result.add(node.point);
if (node.left != null) _add_whole_branch(node.left, result);
if (node.right != null) _add_whole_branch(node.right, result);
}
In this image, all the big white dots were added using _add_whole_branch; only for the red dots did a check over all dimensions have to be done.
Optimizations
1)
Instead of starting the _get_range function at the root node, you can first find the split node. This is the first node that has its point within the query range. To find the split node you still need to start at the root node, but the calculations are a bit cheaper, because until you reach it you only go either left or right.
2)
Currently I create a new float[] node_region_left and float[] node_region_right at every step, and since this happens in a recursive function it can lead to quite a few arrays. However, you can reuse the left one for the right. I didn't do that in this example for clarity reasons.
I can also imagine storing the region extent in the node itself, but that takes quite a bit more memory and might lead to a lot of cache misses.

Finding all non-comparable nodes in DAG

I am interested in finding sets of vertices that are not ordered in a directed acyclic graph (in the sense of a topological order).
That is, for example: two vertices in non-connected subgraphs, or the pairs (B,C), (B,D) in cases such as this:
The naive possibility I thought of was to enumerate all the topological sorts (in this case [ A, B, C, D ] and [ A, C, D, B ]) and find all pairs whose relative order differs in at least two sorts, but this would be pretty expensive computationally.
Are there other, faster possibilities for what I want to achieve ? I am using boost.graph.
Basically what you want is the pairs of nodes (u,v) such that there is no path from u to v and no path from v to u. Using DFS you can find, for each node, all nodes that are reachable from it; total complexity O(n(n+m)).
Now all you have to do is check, for each pair, that neither of the two nodes is reachable from the other.
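A minimal sketch of that idea (my own illustration, using plain adjacency lists rather than boost.graph): run a DFS from every vertex to build a reachability table, then report the pairs where neither direction is reachable.
#include <vector>
#include <utility>
// reach[u][v] == true iff v is reachable from u (DFS from every vertex, O(n(n+m)))
std::vector<std::vector<bool>> reachability(const std::vector<std::vector<int>>& adj) {
    int n = adj.size();
    std::vector<std::vector<bool>> reach(n, std::vector<bool>(n, false));
    for (int s = 0; s < n; ++s) {
        std::vector<int> stack{s};
        while (!stack.empty()) {
            int u = stack.back(); stack.pop_back();
            for (int v : adj[u])
                if (!reach[s][v]) { reach[s][v] = true; stack.push_back(v); }
        }
    }
    return reach;
}
// pairs (u,v) with no path u->v and no path v->u, i.e. the non-comparable pairs
std::vector<std::pair<int,int>> incomparable_pairs(const std::vector<std::vector<int>>& adj) {
    auto reach = reachability(adj);
    std::vector<std::pair<int,int>> result;
    for (int u = 0; u < (int)adj.size(); ++u)
        for (int v = u + 1; v < (int)adj.size(); ++v)
            if (!reach[u][v] && !reach[v][u]) result.emplace_back(u, v);
    return result;
}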
You can start with a simple topological sort. Boost's implementation conveniently returns a reverse ordered list of vertices.
You can iterate that list, marking each initial leaf node with a new branch id until a shared node is encountered.
Demo Time
Let's start with the simplest of graph models:
#include <boost/graph/adjacency_list.hpp>
using Graph = boost::adjacency_list<>;
We wish to map branches:
using BranchID = int;
using BranchMap = std::vector<BranchID>; // maps vertex id -> branch id
We want to build, map and visualize the mappings:
Graph build();
BranchMap map_branches(Graph const&);
void visualize(Graph const&, BranchMap const& branch_map);
int main() {
// sample data
Graph g = build();
// do the topo sort and distinguish branches
BranchMap mappings = map_branches(g);
// output
visualize(g, mappings);
}
Building Graph
Just the sample data from the question:
Graph build() {
Graph g(4);
enum {A,B,C,D};
add_edge(A, B, g);
add_edge(A, C, g);
add_edge(C, D, g);
return g;
}
Mapping The Branches
As described in the introduction:
#include <boost/graph/topological_sort.hpp>
std::vector<BranchID> map_branches(Graph const& g) {
std::vector<Vertex> reverse_topo;
boost::topological_sort(g, back_inserter(reverse_topo));
// traverse the output to map to unique branch ids
std::vector<BranchID> branch_map(num_vertices(g));
BranchID branch_id = 0;
for (auto v : reverse_topo) {
auto degree = out_degree(v, g);
if (0 == degree) // is leaf?
++branch_id;
if (degree < 2) // "unique" path
branch_map[v] = branch_id;
}
return branch_map;
}
Visualizing
Let's write a graph-viz representation with each branch colored:
#include <boost/graph/graphviz.hpp>
#include <iostream>
void visualize(Graph const& g, BranchMap const& branch_map) {
// display helpers
std::vector<std::string> const colors { "gray", "red", "green", "blue" };
auto name = [](Vertex v) -> char { return 'A'+v; };
auto color = [&](Vertex v) -> std::string { return colors[branch_map.at(v) % colors.size()]; };
// write graphviz:
boost::dynamic_properties dp;
dp.property("node_id", transform(name));
dp.property("color", transform(color));
write_graphviz_dp(std::cout, g, dp);
}
This uses a tiny shorthand helper to create the transforming property maps:
// convenience short-hand to write transformed property maps
template <typename F>
static auto transform(F f) { return boost::make_transform_value_property_map(f, boost::identity_property_map{}); };
To compile this with a pre-C++14 compiler, you can replace the calls to transform with the expanded body.
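For example (my own expansion, spelling out the same Boost call that the helper wraps):
// pre-C++14 variant: write out the wrapped call instead of relying on transform()'s deduced return type
boost::dynamic_properties dp;
dp.property("node_id", boost::make_transform_value_property_map(name,  boost::identity_property_map{}));
dp.property("color",   boost::make_transform_value_property_map(color, boost::identity_property_map{}));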
Full Listing
Live On Coliru
#include <boost/graph/adjacency_list.hpp>
using Graph = boost::adjacency_list<>;
using BranchID = int;
using BranchMap = std::vector<BranchID>; // maps vertex id -> branch id
Graph build();
BranchMap map_branches(Graph const&);
void visualize(Graph const&, BranchMap const& branch_map);
int main() {
// sample data
Graph g = build();
// do the topo sort and distinguish branches
BranchMap mappings = map_branches(g);
// output
visualize(g, mappings);
}
using Vertex = Graph::vertex_descriptor;
Graph build() {
Graph g(4);
enum {A,B,C,D};
add_edge(A, B, g);
add_edge(A, C, g);
add_edge(C, D, g);
return g;
}
#include <boost/graph/topological_sort.hpp>
std::vector<BranchID> map_branches(Graph const& g) {
std::vector<Vertex> reverse_topo;
boost::topological_sort(g, back_inserter(reverse_topo));
// traverse the output to map to unique branch ids
std::vector<BranchID> branch_map(num_vertices(g));
BranchID branch_id = 0;
for (auto v : reverse_topo) {
auto degree = out_degree(v, g);
if (0 == degree) // is leaf?
++branch_id;
if (degree < 2) // "unique" path
branch_map[v] = branch_id;
}
return branch_map;
}
#include <boost/property_map/transform_value_property_map.hpp>
// convenience short-hand to write transformed property maps
template <typename F>
static auto transform(F f) { return boost::make_transform_value_property_map(f, boost::identity_property_map{}); };
#include <boost/graph/graphviz.hpp>
#include <iostream>
void visualize(Graph const& g, BranchMap const& branch_map) {
// display helpers
std::vector<std::string> const colors { "gray", "red", "green", "blue" };
auto name = [](Vertex v) -> char { return 'A'+v; };
auto color = [&](Vertex v) -> std::string { return colors[branch_map.at(v) % colors.size()]; };
// write graphviz:
boost::dynamic_properties dp;
dp.property("node_id", transform(name));
dp.property("color", transform(color));
write_graphviz_dp(std::cout, g, dp);
}
Printing
digraph G {
A [color=gray];
B [color=red];
C [color=green];
D [color=green];
A->B ;
A->C ;
C->D ;
}
And the rendered graph:
Summary
Nodes in branches with different colors cannot be compared.

Populate a tree from vectors with BGL

I have two vectors of objects from which I need to build a tree structure. Let's assume we have vector<obj> parents and vector<obj> leaves. Each element of vector<obj> parents has several leaves that sit at the ends of the tree. What I am doing is defining vertex properties and edge properties as below, and then defining a bidirectional graph:
struct VertexData
{
std::string obj_name; // concatenation of labels
std::string obj_class_num;
int num;
vector <int> segments_list;
bool is_leaf=false;
};
struct EdgeData
{
std::string edge_name;
double confidence;
};
typedef boost::adjacency_list<boost::vecS, boost::vecS,
boost::bidirectionalS,
VertexData,
boost::property<boost::edge_weight_t, double, EdgeData> > Graph;
Graph graph;
First approach: loop through vector<obj> leaves; for each member, find the parent and make an edge, then assign properties to the edge and vertices. But for the next leaf I have to check whether its parent is already in the tree or whether I should add a new vertex for it.
Second approach: another thing I tried was looping through vector<obj> parents and, for each element, trying to create its leaves. But I am not sure what the correct way to do this is.
Here is a link:
adding custom vertices to a boost graph, where I try to do the same but with iterations.
Code added for 1st approach:
vector <class1> parents; // this has some objects of type class1
vector <class2> leaves; // this has some objects of type class2
/// declare the graph
typedef boost::adjacency_list<boost::vecS, boost::vecS,
boost::bidirectionalS,
VertexData,
boost::property<boost::edge_weight_t, double, EdgeData> > Graph;
/// instantiate the graph
Graph graph;
typedef boost::graph_traits<Graph>::vertex_descriptor vertex_t;
typedef boost::graph_traits<Graph>::edge_descriptor edge_t;
vector<vertex_t> obj_vertices;
vector<string> parents_labels_v;
bool parent_exist=false;
/// loop through leaves and make edges with associated parent
for (auto leaf: leaves) {
int leaf_nr = leaf.Number;
vertex_t v = boost::add_vertex(graph); // this is the leaf vertex
graph[v].num = leaf_nr; // leaf number
graph[v].is_leaf = true;
/// access the parent label by leaf number
string label1 = parents[leaf_nr].label;
/// check if the parent already exist, using its label
if(std::find(parents_labels_v.begin(), parents_labels_v.end(), label1)
!= parents_labels_v.end()){
parent_exist = true;
}else{
parents_labels_v.push_back(label1);
}
if(parent_exist) {
// find already_exist parent vertex to make the edge
vertex_t u = ??? here i have problem
// Create an edge connecting those two vertices
edge_t e; bool b;
boost::tie(e,b) = boost::add_edge(u,v,graph);
} else{
// if parent-vertex there is not, add it to the graph
vertex_t u = boost::add_vertex(graph); // this is the parent vertex
graph[u].obj_name = label1;
graph[u].segments_list.push_back(leaf_nr);
obj_vertices.push_back(u);
// Create an edge connecting those two vertices
edge_t e; bool b;
boost::tie(e,b) = boost::add_edge(u,v,graph);
}
}
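One way to fill the gap (a sketch of my own, not an accepted answer): replace parents_labels_v with a map from parent label to vertex descriptor, so an existing parent vertex can be looked up directly; note that parent_exist would also need to be reset to false for every leaf.
#include <map>
#include <string>
// declared before the loop: label -> parent vertex descriptor
std::map<std::string, vertex_t> parent_vertex_by_label;
// inside the loop, instead of searching parents_labels_v:
vertex_t u;
auto it = parent_vertex_by_label.find(label1);
if (it != parent_vertex_by_label.end()) {
    u = it->second;                          // the parent vertex already exists
} else {
    u = boost::add_vertex(graph);            // create the parent vertex once
    graph[u].obj_name = label1;
    parent_vertex_by_label[label1] = u;
}
graph[u].segments_list.push_back(leaf_nr);   // record this leaf under its parent
boost::add_edge(u, v, graph);                // connect parent and leaf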

C++11: Algorithm & data structure separation

I have the following basic class structure:
class Distance : public Base {
public:
using Base::Base;
void run(int u, int v); // indices for nodes in graph
void runAll();
};
and
class Base {
protected:
const Graph& G;
Matrix& results;
public:
explicit Base(const Graph& G, Matrix& results);
virtual ~Base() = default;
virtual void run(int u, int v) = 0;
virtual void runAll() = 0;
double getResult(int u, int v) const;
const Matrix& getResults() const;
};
where Distance calculates a distance score, either for a node pair (u, v) or for all possible node pairs, which gets stored in the results matrix.
The problem right now is the separation of the algorithm and the data structure. If I want to reuse the run method of Distance in a second class AnotherDistance derived from Base, I have to allocate the matrix twice (once in Distance and once in AnotherDistance), which is not feasible in my case as the matrix could amount to multiple GB.
What would be the best approach to solve this? I could take the results Matrix as an argument in the constructor (probably bad design), or maybe use move semantics in a separate getter which would empty the matrix held by the class.
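One possible arrangement (a sketch of my own, not a definitive answer): let the caller own the single Matrix and pass it by reference to every algorithm object, exactly as the Base constructor above already allows, so the storage is allocated only once and shared. The load_graph helper and the Graph/Matrix interfaces used here are assumptions for illustration.
int main() {
    Graph g = load_graph();                 // hypothetical loader
    Matrix results(g.size(), g.size());     // allocated exactly once, possibly several GB

    Distance d(g, results);
    d.runAll();                             // fills results

    AnotherDistance ad(g, results);         // reuses the same storage, no second allocation
    ad.run(0, 1);                           // updates entries in place
    return 0;
}
If shared ownership rather than caller ownership is needed, a std::shared_ptr<Matrix> member in Base would serve the same purpose.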
