How to implement range search in KD-Tree - algorithm

I have built a d dimensional KD-Tree. I want to do range search on this tree. Wikipedia mentions range search in KD-Trees, but doesn't talk about implementation/algorithm in any way. Can someone please help me with this? If not for any arbitrary d, any help for at least for d = 2 and d = 3 would be great. Thanks!

There are multiple variants of kd-tree. The one I used had the following specs:
Each internal node has max two nodes.
Each leaf node can have max maxCapacity points.
No internal node stores any points.
Side note: there are also versions where each node (irrespective of whether its internal or leaf) stores exactly one point. The algorithm below can be tweaked for those too. Its mainly the buildTree where the key difference lies.
I wrote an algorithm for this some 2 years back, thanks to the resource pointed to by #9mat .
Suppose the task is to find the number of points which lie in a given hyper-rectangle ("d" dimensions). This task can also be to list all points OR all points which lie in given range and satisfy some other criteria etc, but that can be a straightforward change to my code.
Define a base node class as:
template <typename T> class kdNode{
public: kdNode(){}
virtual long rangeQuery(const T* q_min, const T* q_max) const{ return 0; }
};
Then, an internal node (non-leaf node) can look like this:
class internalNode:public kdNode<T>{
const kdNode<T> *left = nullptr, *right = nullptr; // left and right sub trees
int axis; // the axis on which split of points is being done
T value; // the value based on which points are being split
public: internalNode(){}
void buildTree(...){
// builds the tree recursively
}
// returns the number of points in this sub tree that lie inside the hyper rectangle formed by q_min and q_max
int rangeQuery(const T* q_min, const T* q_max) const{
// num of points that satisfy range query conditions
int rangeCount = 0;
// check for left node
if(q_min[axis] <= value) {
rangeCount += left->rangeQuery(q_min, q_max);
}
// check for right node
if(q_max[axis] >= value) {
rangeCount += right->rangeQuery(q_min, q_max);
}
return rangeCount;
}
};
Finally, the leaf node would look like:
class leaf:public kdNode<T>{
// maxCapacity is a hyper - param, the max num of points you allow a node to hold
array<T, d> points[maxCapacity];
int keyCount = 0; // this is the actual num of points in this leaf (keyCount <= maxCapacity)
public: leaf(){}
public: void addPoint(const T* p){
// add a point p to the leaf node
}
// check if points[index] lies inside the hyper rectangle formed by q_min and q_max
inline bool containsPoint(const int index, const T* q_min, const T* q_max) const{
for (int i=0; i<d; i++) {
if (points[index][i] > q_max[i] || points[index][i] < q_min[i]) {
return false;
}
}
return true;
}
// returns number of points in this leaf node that lie inside the hyper rectangle formed by q_min and q_max
int rangeQuery(const T* q_min, const T* q_max) const{
// num of points that satisfy range query conditions
int rangeCount = 0;
for(int i=0; i < this->keyCount; i++) {
if(containsPoint(i, q_min, q_max)) {
rangeCount++;
}
}
return rangeCount;
}
};
In the code for range query inside the leaf node, it is also possible to do a "binary search" inside of "linear search". Since the points will be sorted along on the axis axis, you can do a binary search do find l and r values using q_min and q_max, and then do a linear search from l to r instead of 0 to keyCount-1 (of course in the worst case it wont help, but practically, and especially if you have a capacity of pretty high values, this may help).

This is my solution for a KD-tree, where each node stores points (so not just the leafs). (Note that adapting for where points are stored only in the leafs is really easy).
I leaf some of the optimizations out and will explain them at the end, this to reduce the complexity of the solution.
The get_range function has varargs at the end, and can be called like,
x1, y1, x2, y2 or
x1, y1, z1, x2, y2, z2 etc. Where first the low values of the range are given and then the high values.
(You can use as many dimensions as you like).
static public <T> void get_range(K_D_Tree<T> tree, List<T> result, float... range) {
if (tree.root == null) return;
float[] node_region = new float[tree.DIMENSIONS * 2];
for (int i = 0; i < tree.DIMENSIONS; i++) {
node_region[i] = -Float.MAX_VALUE;
node_region[i+tree.DIMENSIONS] = Float.MAX_VALUE;
}
_get_range(tree, result, tree.root, node_region, 0, range);
}
The node_region represents the region of the node, we start as large as possible. Cause for all we know this could be the region we are dealing with.
Here the recursive _get_range implementation:
static public <T> void _get_range(K_D_Tree<T> tree, List<T> result, K_D_Tree_Node<T> node, float[] node_region, int dimension, float[] target_region) {
if (dimension == tree.DIMENSIONS) dimension = 0;
if (_contains_region(tree, node_region, target_region)) {
_add_whole_branch(node, result);
}
else {
float value = _value(tree, dimension, node);
if (node.left != null) {
float[] node_region_left = new float[tree.DIMENSIONS*2];
System.arraycopy(node_region, 0, node_region_left, 0, node_region.length);
node_region_left[dimension + tree.DIMENSIONS] = value;
if (_intersects_region(tree, node_region_left, target_region)){
_get_range(tree, result, node.left, node_region_left, dimension+1, target_region);
}
}
if (node.right != null) {
float[] node_region_right = new float[tree.DIMENSIONS*2];
System.arraycopy(node_region, 0, node_region_right, 0, node_region.length);
node_region_right[dimension] = value;
if (_intersects_region(tree, node_region_right, target_region)){
_get_range(tree, result, node.right, node_region_right, dimension+1, target_region);
}
}
if (_region_contains_node(tree, target_region, node)) {
result.add(node.point);
}
}
}
One important thing that the other answer does not provide is this part:
if (_contains_region(tree, node_region, target_region)) {
_add_whole_branch(node, result);
}
With a range search for a KD-Tree you have 3 options for a node's region, it's:
fully outside
it intersects
it's fully contained
Once you know a region is fully contained, then you can add the whole branch without doing any dimension checks.
To make it more clear, here is the _add_whole_branch:
static public <T> void _add_whole_branch(K_D_Tree_Node<T> node, List<T> result) {
result.add(node.point);
if (node.left != null) _add_whole_branch(node.left, result);
if (node.right != null) _add_whole_branch(node.right, result);
}
In this image, all the big white dots where added using _add_whole_branch and only for the red dots a check for all dimensions had to be done.
Optimization
1)
Instead of starting with the root node for the _get_range function, instead you can find the split node. This is the first node that has it's point within the query range. To find the split node you will still need to start at the root node, but the calculations are a bit cheaper (cause you go either left or right till).
2)
Now I create the float[] node_region_left and float[] node_region_right, and since this happens in a recursive function it can lead to quite some arrays. However, you can reuse the one for the left for the right. I didn't do it in this example for clarity reasons.
I can also imagine storing the region size in the node, but this takes quite some more memory and might lead to a lot of cache misses.

Related

Find all the nodes which can be reached from a given node within certain distance

Multiple nodes can also be traversed in sequence to create a path.
struct Node {
float pos [2];
bool visited = false;
};
// Search Space
Node nodes [MAX_NODE_COUNT];
// Function to return all reachable nodes
// nodeCount : Number of nodes
// nodes : Array of nodes
// indexCount : Size of the returned array
// Return : Returns array of all reachable node indices
int* GetReachableNodes (int nodeCount, Node* nodes, int* indexCount)
{
// This is the naive approach
queue <int> indices;
vector <int> indexList;
indices.push (nodes [0]);
nodes [0].visited = true;
// Loop while queue is not empty
while (!indices.empty ())
{
// Pop the front of the queue
int parent = indices.front ();
indices.pop ();
// Loop through all children
for (int i = 0; i < nodeCount; ++i)
{
if (!nodes [i].visited && i != parent && DistanceSqr (nodes [i], nodes [parent]) < maxDistanceSqr)
{
indices.push (i);
nodes [i].visited = true;
indexList.push_back (i);
}
}
}
int* returnData = new int [indexList.size ()];
std::move (indexList.begin (), indexList.end (), returnData);
*indexCount = indexList.size ();
// Caller responsible for delete
return returnData;
}
The problem is I can't use a graph since all nodes are connected to each other.
All the data is just an array of nodes with their positions and the start node.
I can solve this problem easily with BFS but it's n^2.
I have been figuring out how to solve this in less than O(n^2) time complexity.
Some of the approaches I tried were parallel BFS, Dijkstra but it doesn't help a lot.
Any help will be really appreciated.

Type conversion on multidimensional map iterator

struggling a bit with getting ahold of map keys. I am wanting this method to modify a vector of edges where the index is the dest node and the value at that index is the weight. The vector "edges" is already initialized large enough to hold the edges. The compiler is complaining that it cannot convert iterator type to int. Anyone have any advice or thoughts on how to handle this conversion if it is even possible? Thanks. This is part of an assignment I am working on implementing Dijkstra's shortest path for a MOOC.
void CompleteGraph::GetNodeEdges(int Node, std::vector<int> &edges){
// Modifies a vector of edges given the source node sorted by edge weight
// Iterate over a map. Grab all edges that have the given start node(map's 1st key)
typedef std::map<int, std::map<int, int> >::iterator iter;
for(iter i = Graph.begin(); i != Graph.end(); i++){
if (i->first == Node){
edges[i->second.begin()] = i->second.end();
}
}
// Sort vector here: will do this next
}
Your problem is obviously on this line:
edges[i->second.begin()] = i->second.end();
The type of the key of edges is int, but i->second.begin() is returning an iterator, because i->second returns a map. I guess you need something like:
edges[i->second.begin()->first] = i->second.end();
Depending on what information from Graph you want to use, as you haven't told us about what the Graph represents.
I suspect you are looking for something like this:
iter i = Graph.find(Node);
if (i != Graph.end()) {
map<int, int>& edges_map = i->second;
for (map<int, int>::iterator m = edges_map.begin();
m != edges_map.end(); ++m) {
if (edges.size() <= m->first) {
edges.resize(m->first + 1);
}
edges[m->first] = m->second;
}
}

KD TREES (3-D) Nearest Neighbour Search

I am looking at the Wikipedia page for KD trees Nearest Neighbor Search.
The pseudo code given in Wikipedia works when the points are in 2-D(x,y) .
I want to know,what changes should i make,when the points are 3-D(x,y,z).
I googled a lot and even went through similar questions link in stack overflow ,but i did n't find the 3-d implementation any where,all previous question takes 2-D points as input ,not the 3-D points that i am looking for.
The pseudo code in Wiki for building the KD Tree is::
function kdtree (list of points pointList, int depth)
{
// Select axis based on depth so that axis cycles through all valid values
var int axis := depth mod k;
// Sort point list and choose median as pivot element
select median by axis from pointList;
// Create node and construct subtrees
var tree_node node;
node.location := median;
node.leftChild := kdtree(points in pointList before median, depth+1);
node.rightChild := kdtree(points in pointList after median, depth+1);
return node;
}
How to find the Nearest neighbor now after building the KD Trees?
Thanks!
You find the nearest neighbour exactly as described on the Wikipedia page under the heading "Nearest neighbour search". The description there applies in any number of dimensions. That is:
Go down the tree recursively from the root as if you're about to insert the point you're looking for the nearest neighbour of.
When you reach a leaf, note it as best-so-far.
On the way up the tree again, for each node as you meet it:
If it's closer than the best-so-far, update the best-so-far.
If the distance from best-so-far to the target point is greater than the distance from the target point to the splitting hyperplane at this node,
process the other child of the node too (using the same recursion).
I've recently coded up a KDTree for nearest neighbor search in 3-D space and ran into the same problems understand the NNS, particularly 3.2 of the wiki. I ended up using this algorithm which seems to work in all my tests:
Here is the initial leaf search:
public Collection<T> nearestNeighbourSearch(int K, T value) {
if (value==null) return null;
//Map used for results
TreeSet<KdNode> results = new TreeSet<KdNode>(new EuclideanComparator(value));
//Find the closest leaf node
KdNode prev = null;
KdNode node = root;
while (node!=null) {
if (KdNode.compareTo(node.depth, node.k, node.id, value)<0) {
//Greater
prev = node;
node = node.greater;
} else {
//Lesser
prev = node;
node = node.lesser;
}
}
KdNode leaf = prev;
if (leaf!=null) {
//Used to not re-examine nodes
Set<KdNode> examined = new HashSet<KdNode>();
//Go up the tree, looking for better solutions
node = leaf;
while (node!=null) {
//Search node
searchNode(value,node,K,results,examined);
node = node.parent;
}
}
//Load up the collection of the results
Collection<T> collection = new ArrayList<T>(K);
for (KdNode kdNode : results) {
collection.add((T)kdNode.id);
}
return collection;
}
Here is the recursive search which starts at the closest leaf node:
private static final <T extends KdTree.XYZPoint> void searchNode(T value, KdNode node, int K, TreeSet<KdNode> results, Set<KdNode> examined) {
examined.add(node);
//Search node
KdNode lastNode = null;
Double lastDistance = Double.MAX_VALUE;
if (results.size()>0) {
lastNode = results.last();
lastDistance = lastNode.id.euclideanDistance(value);
}
Double nodeDistance = node.id.euclideanDistance(value);
if (nodeDistance.compareTo(lastDistance)<0) {
if (results.size()==K && lastNode!=null) results.remove(lastNode);
results.add(node);
} else if (nodeDistance.equals(lastDistance)) {
results.add(node);
} else if (results.size()<K) {
results.add(node);
}
lastNode = results.last();
lastDistance = lastNode.id.euclideanDistance(value);
int axis = node.depth % node.k;
KdNode lesser = node.lesser;
KdNode greater = node.greater;
//Search children branches, if axis aligned distance is less than current distance
if (lesser!=null && !examined.contains(lesser)) {
examined.add(lesser);
double nodePoint = Double.MIN_VALUE;
double valuePlusDistance = Double.MIN_VALUE;
if (axis==X_AXIS) {
nodePoint = node.id.x;
valuePlusDistance = value.x-lastDistance;
} else if (axis==Y_AXIS) {
nodePoint = node.id.y;
valuePlusDistance = value.y-lastDistance;
} else {
nodePoint = node.id.z;
valuePlusDistance = value.z-lastDistance;
}
boolean lineIntersectsCube = ((valuePlusDistance<=nodePoint)?true:false);
//Continue down lesser branch
if (lineIntersectsCube) searchNode(value,lesser,K,results,examined);
}
if (greater!=null && !examined.contains(greater)) {
examined.add(greater);
double nodePoint = Double.MIN_VALUE;
double valuePlusDistance = Double.MIN_VALUE;
if (axis==X_AXIS) {
nodePoint = node.id.x;
valuePlusDistance = value.x+lastDistance;
} else if (axis==Y_AXIS) {
nodePoint = node.id.y;
valuePlusDistance = value.y+lastDistance;
} else {
nodePoint = node.id.z;
valuePlusDistance = value.z+lastDistance;
}
boolean lineIntersectsCube = ((valuePlusDistance>=nodePoint)?true:false);
//Continue down greater branch
if (lineIntersectsCube) searchNode(value,greater,K,results,examined);
}
}
The full java source can be found here.
I want to know,what changes should i make,when the points are
3-D(x,y,z).
You get the current axis on this line
var int axis := depth mod k;
Now depending on the axis, you find the median by comparing the corresponding property. Eg. if axis = 0 you compare against the x property. One way to implement this is to pass a comparator function in the routine that does the search.

Writing a program to check if a graph is bipartite

I need to write a program that check if a graph is bipartite.
I have read through wikipedia articles about graph coloring and bipartite graph. These two article suggest methods to test bipartiteness like BFS search, but I cannot write a program implementing these methods.
Why can't you? Your question makes it hard for someone to even write the program for you since you don't even mention a specific language...
The idea is to start by placing a random node into a FIFO queue (also here). Color it blue. Then repeat this while there are nodes still left in the queue: dequeue an element. Color its neighbors with a different color than the extracted element and insert (enqueue) each neighbour into the FIFO queue. For example, if you dequeue (extract) an element (node) colored red, color its neighbours blue. If you extract a blue node, color its neighbours red. If there are no coloring conflicts, the graph is bipartite. If you end up coloring a node with two different colors, than it's not bipartite.
Like #Moron said, what I described will only work for connected graphs. However, you can apply the same algorithm on each connected component to make it work for any graph.
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/GraphAlgor/breadthSearch.htm
Please read this web page, using breadth first search to check when you find a node has been visited, check the current cycle is odd or even.
A graph is bipartite if and only if it does not contain an odd cycle.
The detailed implementation is as follows (C++ version):
struct NODE
{
int color;
vector<int> neigh_list;
};
bool checkAllNodesVisited(NODE *graph, int numNodes, int & index);
bool checkBigraph(NODE * graph, int numNodes)
{
int start = 0;
do
{
queue<int> Myqueue;
Myqueue.push(start);
graph[start].color = 0;
while(!Myqueue.empty())
{
int gid = Myqueue.front();
for(int i=0; i<graph[gid].neigh_list.size(); i++)
{
int neighid = graph[gid].neigh_list[i];
if(graph[neighid].color == -1)
{
graph[neighid].color = (graph[gid].color+1)%2; // assign to another group
Myqueue.push(neighid);
}
else
{
if(graph[neighid].color == graph[gid].color) // touble pair in the same group
return false;
}
}
Myqueue.pop();
}
} while (!checkAllNodesVisited(graph, numNodes, start)); // make sure all nodes visited
// to be able to handle several separated graphs, IMPORTANT!!!
return true;
}
bool checkAllNodesVisited(NODE *graph, int numNodes, int & index)
{
for (int i=0; i<numNodes; i++)
{
if (graph[i].color == -1)
{
index = i;
return false;
}
}
return true;
}

algorithm to use to return a specific range of nodes in a directed graph

I have a class Graph with two lists types namely nodes and edges
I have a function
List<int> GetNodesInRange(Graph graph, int Range)
when I get these parameters I need an algorithm that will go through the graph and return the list of nodes only as deep (the level) as the range.
The algorithm should be able to accommodate large number of nodes and large ranges.
Atop this, should I use a similar function
List<int> GetNodesInRange(Graph graph, int Range, int selected)
I want to be able to search outwards from it, to the number of nodes outwards (range) specified.
alt text http://www.freeimagehosting.net/uploads/b110ccba58.png
So in the first function, should I pass the nodes and require a range of say 2, I expect the results to return the nodes shown in the blue box.
The other function, if I pass the nodes as in the graph with a range of 1 and it starts at node 5, I want it to return the list of nodes that satisfy this criteria (placed in the orange box)
What you need seems to be simply a depth-limited breadth-first search or depth-first search, with an option of ignoring edge directionality.
Here's a recursive definition that may help you:
I'm the only one of range 1 from myself.
I know who my immediate neighbors are.
If N > 1, then those of range N from myself are
The union of all that is of range N-1 from my neighbors
It should be a recursive function, that finds neighbours of the selected, then finds neighbours of each neighbour until range is 0. DFS search something like that:
List<int> GetNodesInRange(Graph graph, int Range, int selected){
var result = new List<int>();
result.Add( selected );
if (Range > 0){
foreach ( int neighbour in GetNeighbours( graph, selected ) ){
result.AddRange( GetNodesInRange(graph, Range-1, neighbour) );
}
}
return result;
}
You should also check for cycles, if they are possible. This code is for tree structure.
// get all the nodes that are within Range distance of the root node of graph
Set<int> GetNodesInRange(Graph graph, int Range)
{
Set<int> out = new Set<int>();
GetNodesInRange(graph.root, int Range, out);
return out;
}
// get all the nodes that are within Range successor distance of node
// accepted nodes are placed in out
void GetNodesInRange(Node node, int Range, Set<int> out)
{
boolean alreadyVisited = out.add(node.value);
if (alreadyVisited) return;
if (Range == 0) return;
// for each successor node
{
GetNodesInRange(successor, Range-1, out);
}
}
// get all the nodes that are within Range distance of selected node in graph
Set<int> GetNodesInRange(Graph graph, int Range, int selected)
{
Set<int> out = new Set<int>();
GetNodesInRange(graph, Range, selected, out);
return out;
}
// get all the nodes that are successors of node and within Range distance
// of selected node
// accepted nodes are placed in out
// returns distance to selected node
int GetNodesInRange(Node node, int Range, int selected, Set<int> out)
{
if (node.value == selected)
{
GetNodesInRange(node, Range-1, out);
return 1;
}
else
{
int shortestDistance = Range + 1;
// for each successor node
{
int distance = GetNodesInRange(successor, Range, selected, out);
if (distance < shortestDistance) shortestDistance = distance;
}
if (shortestDistance <= Range)
{
out.add(node.value);
}
return shortestDistance + 1;
}
}
I modified your requirements somewhat to return a Set rather than a List.
The GetNodesInRange(Graph, int, int) method will not handle graphs that contain cycles. This can be overcome by maintaining a collection of nodes that have already been visited. The GetNodesInRange(Graph, int) method makes use of the fact that the out set is a collection of visited nodes to overcome cycles.
Note: This has not been tested in any way.

Resources