Should a kd tree be balanced like a binary search tree?


I tried coding a kd tree and a binary search tree. I'm wondering why the BST gives me a stack overflow error when inserting unbalanced data while the kd tree does not.
code for the binary search tree:
void insert(node *&_node, int _val) { //bst insert function, will change raw pointers to smart pointers later
if (_node == NULL) {
_node = new node;
_node->val = _val;
}
else if (_node->val > _val)
insert(_node->left, _val);
else if (_node->val < _val)
insert(_node->right, _val);
}
for (int i = 0; i < 4000;i++) { //inserting unbalanced data, gives me stack overflow error
test.insert(test.root, i);
}
code for the kd-tree:
void insert(Nodeptr &node, int point[], int depth) { //kd tree insert function, Nodeptr is a typedef of unique_ptr<Node>
if (node == NULL) {
node = Nodeptr(new Node(point));
node->axis = depth%dim;
}
else if (node->pt[depth%dim] > point[depth%dim])
insert(node->left, point, depth+1);
else if (node->pt[depth%dim] < point[depth%dim])
insert(node->right, point, depth+1);
}
for (int i = 0; i < 5000; i++) { // inserting unbalanced data, does not give me stack overflow error
point[0] = i;
point[1] = i;
tree.insert(tree.root, point, 0);
}
edit: Visual Studio says that my heap size is 224.57 KB and that 3746 nodes had been created when the exception occurred, so it didn't actually complete the 4000 insertions
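Both insert functions recurse once per tree level, so sorted input makes the recursion depth grow with the number of nodes already inserted. One way to take the call stack out of the picture is to walk down the tree in a loop. A minimal sketch, assuming a simplified node type (the `Node` struct, `insert_iterative`, and `right_spine_length` below are illustrative names, not the code from the question):

```cpp
#include <cassert>
#include <memory>

// Illustrative node type with the same val/left/right shape as in the question.
struct Node {
    int val;
    std::unique_ptr<Node> left, right;
    explicit Node(int v) : val(v) {}
};

// Inserts val without recursion: follow links until an empty slot is found.
void insert_iterative(std::unique_ptr<Node>& root, int val) {
    std::unique_ptr<Node>* slot = &root;
    while (*slot) {
        if (val < (*slot)->val)      slot = &(*slot)->left;
        else if (val > (*slot)->val) slot = &(*slot)->right;
        else return;                 // duplicates are ignored, as in the question
    }
    *slot = std::make_unique<Node>(val);
    assert((*slot)->val == val);     // the new node now occupies the empty slot
}

// Counts the nodes on the right spine, also without recursion.
int right_spine_length(const std::unique_ptr<Node>& root) {
    int n = 0;
    for (const Node* p = root.get(); p; p = p->right.get()) ++n;
    return n;
}
```

Inserting 0..3999 this way produces a right-leaning chain of 4000 nodes while insertion itself uses constant stack space. The tree is still unbalanced, so lookups stay linear, and the default `unique_ptr` destruction of such a chain is itself recursive.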

Related

How to derive the proof of this formula for getting right child for a binary tree given inorder and preorder traversals?

I'm looking at this question on leetcode. Given two arrays, inorder and preorder, you need to construct a binary tree. I get the general solution of the question.
Preorder traversal visits root, left, and right, so the left child would be current preorder node index + 1. From that value, you can then know how many nodes are on the left of the tree using the inorder array. In the answers, the formula used to get the right child is "preStart + inIndex - inStart + 1".
I don't want to memorize the formula so I'm wondering if there is a proof for this? I went through the discussion board there, but I'm still missing a link.

For Python Only
In Python we can also use pop(0) to solve this problem, even though that is inefficient (it would still pass).
For efficiency we could use deque() with popleft(), but not on LeetCode, because we don't have control over the input there.
class Solution:
    def buildTree(self, preorder, inorder):
        if inorder:
            index = inorder.index(preorder.pop(0))
            root = TreeNode(inorder[index])
            root.left = self.buildTree(preorder, inorder[:index])
            root.right = self.buildTree(preorder, inorder[index + 1:])
            return root
For Java and C++ it is a bit different, just as you said. I don't have the proof, but maybe this code will be somewhat helpful:
public class Solution {
public static final TreeNode buildTree(
final int[] preorder,
final int[] inorder
) {
return traverse(0, 0, inorder.length - 1, preorder, inorder);
}
private static final TreeNode traverse(
final int preStart,
final int inStart,
final int atEnd,
final int[] preorder,
final int[] inorder
) {
if (preStart > preorder.length - 1 || inStart > atEnd) {
return null;
}
TreeNode root = new TreeNode(preorder[preStart]);
int inorderIndex = 0;
for (int i = inStart; i <= atEnd; i++)
if (inorder[i] == root.val) {
inorderIndex = i;
}
root.left = traverse(preStart + 1, inStart, inorderIndex - 1, preorder, inorder);
root.right = traverse(preStart + inorderIndex - inStart + 1, inorderIndex + 1, atEnd, preorder, inorder);
return root;
}
}
C++
// The following block might slightly improve the execution time;
// Can be removed;
static const auto __optimize__ = []() {
std::ios::sync_with_stdio(false);
std::cin.tie(nullptr);
std::cout.tie(nullptr);
return 0;
}();
// Most of headers are already included;
// Can be removed;
#include <cstdint>
#include <vector>
#include <unordered_map>
using ValueType = int;
static const struct Solution {
TreeNode* buildTree(
std::vector<ValueType>& preorder,
std::vector<ValueType>& inorder
) {
std::unordered_map<ValueType, ValueType> inorder_indices;
for (ValueType index = 0; index < std::size(inorder); ++index) {
inorder_indices[inorder[index]] = index;
}
return build(preorder, inorder, inorder_indices, 0, 0, std::size(inorder) - 1);
}
private:
TreeNode* build(
std::vector<ValueType>& preorder,
std::vector<ValueType>& inorder,
std::unordered_map<ValueType, ValueType>& inorder_indices,
ValueType pre_start,
ValueType in_start,
ValueType in_end
) {
if (pre_start >= std::size(preorder) || in_start > in_end) {
return nullptr;
}
TreeNode* root = new TreeNode(preorder[pre_start]);
ValueType pre_index = inorder_indices[preorder[pre_start]];
root->left = build(preorder, inorder, inorder_indices, pre_start + 1, in_start, pre_index - 1);
root->right = build(preorder, inorder, inorder_indices, pre_start + 1 + pre_index - in_start, pre_index + 1, in_end);
return root;
}
};
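The right-child index used in both versions follows from how a preorder traversal lays out a subtree: the root first, then every node of the left subtree, then the right subtree. The left subtree holds `inIndex - inStart` nodes, so the root of the right subtree sits at `preStart + 1 + (inIndex - inStart)`. A minimal sketch that names this intermediate quantity, assuming all values are unique (`build_tree`, `build`, and `left_size` are illustrative names):

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct TreeNode {
    int val;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    explicit TreeNode(int v) : val(v) {}
};

// Builds the subtree whose inorder slice is [in_start, in_end] and whose
// root is preorder[pre_start]. Nodes are never freed in this sketch.
TreeNode* build(const std::vector<int>& preorder,
                const std::unordered_map<int, int>& inorder_pos,
                int pre_start, int in_start, int in_end) {
    if (in_start > in_end) return nullptr;
    TreeNode* root = new TreeNode(preorder[pre_start]);
    const int in_index = inorder_pos.at(root->val);
    const int left_size = in_index - in_start;  // nodes in the left subtree
    // Preorder slice: [root][left_size nodes of the left subtree][right subtree]
    root->left = build(preorder, inorder_pos, pre_start + 1, in_start, in_index - 1);
    root->right = build(preorder, inorder_pos, pre_start + 1 + left_size, in_index + 1, in_end);
    return root;
}

TreeNode* build_tree(const std::vector<int>& preorder, const std::vector<int>& inorder) {
    assert(preorder.size() == inorder.size());
    std::unordered_map<int, int> inorder_pos;
    for (int i = 0; i < static_cast<int>(inorder.size()); ++i) inorder_pos[inorder[i]] = i;
    return build(preorder, inorder_pos, 0, 0, static_cast<int>(inorder.size()) - 1);
}
```

With `left_size` spelled out, `pre_start + 1 + left_size` is the same expression as `preStart + inIndex - inStart + 1`; it simply skips the root and the whole left subtree.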

Using Instance Variables vs Function Arguments in Recursion

Is there any difference, efficiency-wise, in using instance variables vs passing arguments by function calls during recursion? For example, I was recently doing a problem on Leetcode which asked to:
Given a Binary Search Tree (BST), convert it to a Greater Tree such that every key of the original BST is changed to the original key plus sum of all keys greater than the original key in BST.
My solution and the most popular one by far is as follows: 46 ms according to Leetcode
class Solution {
public:
int sum = 0;
TreeNode* convertBST(TreeNode* root) {
if (root == NULL) return 0;
convertBST(root->right);
sum += root->val;
root->val = sum;
convertBST(root->left);
return root;
}
};
But why couldn't we also use the following solution, or does it matter? 59 ms runtime according to Leetcode
class Solution {
public:
TreeNode* convertBST(TreeNode* root, int* sum) {
if (root == NULL) return NULL;
convertBST(root->right, sum);
*sum += root->val;
root->val = *sum;
convertBST(root->left, sum);
return root;
}
TreeNode* convertBST(TreeNode* root) {
int sum = 0;
return convertBST(root, & sum);
}
};
Thanks

How to construct a binary tree using a level order traversal sequence

How to construct a binary tree using a level order traversal sequence, for example from sequence {1,2,3,#,#,4,#,#,5}, we can construct a binary tree like this:
    1
   / \
  2   3
     /
    4
     \
      5
where '#' signifies a path terminator where no node exists below.
In the end I implemented Pham Trung's algorithm in C++:
struct TreeNode
{
TreeNode *left;
TreeNode *right;
int val;
TreeNode(int x): left(NULL), right(NULL), val(x) {}
};
TreeNode *build_tree(char nodes[], int n)
{
TreeNode *root = new TreeNode(nodes[0] - '0');
queue<TreeNode*> q;
bool is_left = true;
TreeNode *cur = NULL;
q.push(root);
for (int i = 1; i < n; i++) {
TreeNode *node = NULL;
if (nodes[i] != '#') {
node = new TreeNode(nodes[i] - '0');
q.push(node);
}
if (is_left) {
cur = q.front();
q.pop();
cur->left = node;
is_left = false;
} else {
cur->right = node;
is_left = true;
}
}
return root;
}
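The version above reads one character per node, so it only handles single-digit values. A minimal sketch of the same queue-based idea for arbitrary integers, assuming the sequence is given as a vector of `std::optional<int>` with `std::nullopt` standing in for '#' (`build_from_level_order` and `preorder` are illustrative names, and C++17 is assumed):

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <queue>
#include <vector>

struct TreeNode {
    int val;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    explicit TreeNode(int x) : val(x) {}
};

// tokens[i] holds a value, or std::nullopt where the sequence has '#'.
// Nodes are never freed in this sketch.
TreeNode* build_from_level_order(const std::vector<std::optional<int>>& tokens) {
    if (tokens.empty() || !tokens[0]) return nullptr;
    TreeNode* root = new TreeNode(*tokens[0]);
    std::queue<TreeNode*> pending;   // nodes still waiting for their children
    pending.push(root);
    std::size_t i = 1;
    while (i < tokens.size()) {
        assert(!pending.empty());    // a well-formed sequence never runs out of parents
        TreeNode* parent = pending.front();
        pending.pop();
        if (tokens[i]) { parent->left = new TreeNode(*tokens[i]); pending.push(parent->left); }
        ++i;
        if (i < tokens.size()) {
            if (tokens[i]) { parent->right = new TreeNode(*tokens[i]); pending.push(parent->right); }
            ++i;
        }
    }
    return root;
}

// Collects values in preorder so the result can be inspected.
void preorder(const TreeNode* node, std::vector<int>& out) {
    if (!node) return;
    out.push_back(node->val);
    preorder(node->left, out);
    preorder(node->right, out);
}
```

Each parent is taken from the queue once and consumes the next two tokens as its left and right child, which is the same pairing the `is_left` flag produces above.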

Assuming the tree is stored in an array int[] data with 0-based indexing, simple functions give the children:
Left child
int getLeftChild(int index){
if(index*2 + 1 >= data.length)
return -1;// -1 Means out of bound
return data[(index*2) + 1];
}
Right child
int getRightChild(int index){
if(index*2 + 2 >= data.length)
return -1;// -1 Means out of bound
return data[(index*2) + 2];
}
Edit:
Ok, so by maintaining a queue, we can build this binary tree.
We use a queue to maintain those nodes that are not yet processed.
Using a variable count to keep track of the number of children added for the current node.
First, create a root node, assign it as the current node.
So starting from index 1 (index 0 is the root), as the count is 0, we add this node as left child of the current node.
Increase count. If this node is not '#', add it to the queue.
Moving to the next index, the count is 1, so we add this as right child of current node, reset count to 0 and update current node (by assigning the current node as the first element in the queue). If this node is not '#', add it to the queue.
int count = 0;
Queue q = new Queue();
q.add(new Node(data[0]));
Node cur = null;
for(int i = 1; i < data.length; i++){
Node node = new Node(data[i]);
if(count == 0){
cur = q.dequeue();
}
if(count==0){
count++;
cur.leftChild = node;
}else {
count = 0;
cur.rightChild = node;
}
if(data[i] != '#'){
q.enqueue(node);
}
}
class Node{
int data;
Node leftChild, rightChild;
}
Note: this should only work for a binary tree and not BST.

We can build this binary tree from a level order traversal by maintaining a queue of the nodes that have not been processed yet.
An index variable keeps track of which child of the current node comes next.
1. Create the root node from the first value, make it the current node, and set index to 1.
2. If index is 1, the next value becomes the left child and index becomes 2. If index is 2, the next value becomes the right child; both children are then in place, so move on to the next node in the queue and reset index to 1.
3. If the array value is -1 (no node):
3.a. if index is 1, there is no left child, so set index to 2 and the next value goes to the right.
3.b. if index is 2, there is no right child, so move on to the next node in the queue and reset index to 1.
static class Node{
int data;
Node left;
Node right;
Node(int d){
data=d;
left=null;
right=null;
}
}
public static Node constBT(int arr[],int n){
Node root=null;
Node curr=null;
int index=0;
Queue<Node> q=new LinkedList<>();
for(int i=0;i<n;i++){
if(root==null){
root=new Node(arr[i]);
q.add(root);
curr=q.peek();
index=1;
}else{
if(arr[i]==-1){
if(index==1)
index=2;
else{
q.remove();
curr=q.peek();
index=1;
}
}
else if(index==1){
curr.left=new Node(arr[i]);
q.add(curr.left);
index=2;
}else if(index==2){
curr.right=new Node(arr[i]);
q.add(curr.right);
q.remove();
curr=q.peek();
index=1;
}
}
}
return root;
}

My approach is similar to Pham Trung's, yet more intuitive. We maintain an array of nodes built from the given data instead of using a queue. The idea is to reverse-engineer a queue-based BFS, because BFS on a tree is simply its level order traversal (LOT).
It is important to note that the LOT must include the NULL children of each node; otherwise it is not unique and the tree cannot be reconstructed from it.
In this case LOT : 1,2,3,-1,-1,4,-1,-1,5
where I have used -1 instead of '#' to represent NULLs
And Tree is
        1
      /   \
     2     3
    / \   /
  -1  -1 4
        / \
      -1   5
Here we can easily see that when 1 is popped from the BFS queue, it pushes its left child (2) and right child (3) onto the queue. Similarly, 2 pushes -1 (NULL) for both of its children, and the process continues.
So we can use the following pseudocode to generate the tree rooted at LOT[0]:
j = 1
For every node in LOT:
    if n<=j: break
    if node != NULL:
        make LOT[j] left child of node
        if n<=j+1: break
        make LOT[j+1] right child of node
        j <- j+2
Finally, C++ code for the same
Class Declaration and Preorder traversal
class Node{
public:
int val;
Node* lft, *rgt;
Node(int x ):val(x) {lft=rgt=nullptr;}
};
void preorder(Node* root) {
if(!root) return;
cout<<root->val<<" ";
preorder(root->lft);
preorder(root->rgt);
}
Restoring Tree from LOT Logic
int main(){
int arr[] = {1,2,3,-1,-1,4,-1,-1,5};
const int n = sizeof(arr)/sizeof(int); // const, so brr below is a fixed-size array rather than a non-standard VLA
Node* brr[n];
for(int i=0;i<n;i++) {
if(arr[i]==-1) brr[i] = nullptr;
else brr[i] = new Node(arr[i]);
}
for(int i=0,j=1;j<n;i++) {
if(!brr[i]) continue;
brr[i]->lft = brr[j++];
if(j<n) brr[i]->rgt = brr[j++];
}
preorder(brr[0]);
}
Output: 1 2 3 4 5

How to modify a recursion to a loop version in this case?

I want to modify the C++ code below to use a loop instead of recursion.
I know of two ways to modify it:
1. Learn from the code and write a loop algorithm. In this case I think the code amounts to calling PrintB (for every node except the leaves) and PrintA (for every node except the root) in level order. For a binary (search) tree, how can I traverse it from leaf to root in a loop (without a pointer to the parent)?
2. Use a stack to imitate the call stack. I couldn't get this to work. Can you help and share some useful ideas?
void func(const Node& node) {
if (ShouldReturn(node)) {
return;
}
for (int i = 0; i < node.children_size(); ++i) {
const Node& child_node = node.child(i);
func(child_node);
PrintA();
}
PrintB();
}

Assuming you are using C++
For the stack part, let's say the code does the following:
If the node is a leaf, do nothing.
Otherwise, do the same for each child, calling PrintA after each one,
then call PrintB.
So what if I adjust the code a little? The adjustments are only there to fit the iterative approach.
void func(const Node& node) {
    if (ShouldReturn(node)) return;
    PrintB();
    // Visit the children last to first, so the output is the exact reverse of the original.
    for (int i = node.children_size() - 1; i >= 0; --i) {
        PrintA();
        const Node& child_node = node.child(i);
        func(child_node);
    }
}
// This way it prints the As and Bs in reverse order.
// Let's adjust the code even further.
void func(const Node& node, bool firstCall = true) {
    if (!firstCall) PrintA(); // PrintA is called for every node that is visited except the root, hence firstCall.
    if (ShouldReturn(node)) return;
    PrintB();
    for (int i = node.children_size() - 1; i >= 0; --i) {
        const Node& child_node = node.child(i);
        func(child_node, false);
    }
}
That should reverse the order in which A and B are printed. I hope I'm not wrong :D
So now I want to have two vectors.
// Let's define an enum
typedef enum { fprintA, fprintB } printType;
void func(const Node& node) {
    vector<printType> stackOfPrints;
    vector<const Node*> stackOfNodes; // pointers, because references cannot be stored in a vector
    stackOfNodes.push_back(&node);
    bool first = true; // We don't need PrintA before the root.
    while (!stackOfNodes.empty()) {
        const Node& fNode = *stackOfNodes.back();
        stackOfNodes.pop_back();
        if (!first) stackOfPrints.push_back(fprintA); // If not the root, record PrintA.
        first = false;
        if (ShouldReturn(fNode)) continue;
        stackOfPrints.push_back(fprintB);
        // Push the children in forward order so that the last child is popped first;
        // the recorded prints are replayed in reverse below, which restores the original order.
        // This assumes child(i) returns a reference, as in the question.
        for (int i = 0; i < (int)fNode.children_size(); ++i) {
            stackOfNodes.push_back(&fNode.child(i));
        }
    }
    // Replay stackOfPrints in reverse order (remember, the code was changed to record the As and Bs in reverse),
    // so that the function prints them in the required order.
    while (!stackOfPrints.empty()) {
        switch (stackOfPrints.back()) {
            case fprintA: PrintA(); break;
            case fprintB: PrintB(); break;
            default: break;
        }
        stackOfPrints.pop_back();
    }
}
I hope I wrote the code correctly. :) I hope it helps.
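Another way to use a stack to imitate the call stack is to store, for each active call, the node and the index of the next child to visit. The prints then come out in the original order with no reversal step. A minimal sketch, assuming a stand-in `Node` with a `children` vector, a `ShouldReturn` that is true for leaves, and output collected in a string instead of calling PrintA and PrintB (all of these are assumptions, since the question does not show them; C++17 is assumed):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the question's Node type.
struct Node {
    std::vector<Node> children;
};

// Stand-in predicate: here a call returns early exactly when the node is a leaf.
bool ShouldReturn(const Node& node) { return node.children.empty(); }

// Reference version: same shape as the recursive function in the question.
void func_recursive(const Node& node, std::string& out) {
    if (ShouldReturn(node)) return;
    for (const Node& child : node.children) {
        func_recursive(child, out);
        out += 'A';
    }
    out += 'B';
}

// Loop version: each Frame is one suspended call of the recursive function.
void func_iterative(const Node& root, std::string& out) {
    struct Frame {
        const Node* node;
        std::size_t next_child;  // index of the child to visit next
    };
    if (ShouldReturn(root)) return;
    std::vector<Frame> stack;
    stack.push_back({&root, 0});
    while (!stack.empty()) {
        Frame& top = stack.back();
        assert(top.next_child <= top.node->children.size());
        if (top.next_child == top.node->children.size()) {
            out += 'B';                        // the loop over children is done
            stack.pop_back();                  // "return" to the caller
            if (!stack.empty()) out += 'A';    // the caller prints A after the call
            continue;
        }
        const Node& child = top.node->children[top.next_child++];
        if (ShouldReturn(child)) {
            out += 'A';                        // the call returned immediately
        } else {
            stack.push_back({&child, 0});      // "call" func(child); top is not used after this
        }
    }
}
```

Because every frame records where its loop over the children stopped, the loop resumes a parent exactly where the recursive version would, so both functions produce the same sequence.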

How to improve recursive backtracking algorithm

I implemented a backtracking-based solution for the problem I described in my previous post: Packing items into fixed number of bins
(Bin is a simple wrapper for vector<int> datatype with additional methods such as sum() )
bool backtrack(vector<int>& items, vector<Bin>& bins, unsigned index, unsigned bin_capacity)
{
if (bin_capacity - items.front() < 0) return false;
if (index < items.size())
{
//try to put an item into all opened bins
for(unsigned i = 0; i < bins.size(); ++i)
{
if (bins[i].sum() + items[index] + items.back() <= bin_capacity || bin_capacity - bins[i].sum() == items[index])
{
bins[i].add(items[index]);
return backtrack(items, bins, index + 1, bin_capacity);
}
}
//put an item without exceeding maximum number of bins
if (bins.size() < BINS)
{
Bin new_bin = Bin();
bins.push_back(new_bin);
bins.back().add(items[index]);
return backtrack(items, bins, index + 1, bin_capacity);
}
}
else
{
//check if solution has been found
if (bins.size() == BINS )
{
for (unsigned i = 0; i <bins.size(); ++i)
{
packed_items.push_back(bins[i]);
}
return true;
}
}
return false;
}
Although this algorithm works quite fast, it's prone to stack overflow for large data sets.
I'm looking for any ideas and suggestions how to improve it.
Edit:
I decided to try an iterative approach with an explicit stack, but my solution doesn't work as expected; sometimes it gives incorrect results.
bool backtrack(vector<int>& items, vector<Bin>& bins, unsigned index, unsigned bin_capacity)
{
stack<Node> stack;
Node node, child_node;
Bin new_bin;
//init the stack
node.bins.add(new_bin);
node.bins.back().add(items[item_index]);
stack.push(node);
item_index++;
while(!stack.empty())
{
node = stack.top();
stack.pop();
if (item_index < items.size())
{
if (node.bins.size() < BINS)
{
child_node = node;
Bin empty;
child_node.bins.add(empty);
child_node.bins.back().add(items[item_index]);
stack.push(child_node);
}
int last_index = node.bins.size() - 1;
for (unsigned i = 0; i < node.bins.size(); i++)
{
if (node.bins[last_index - i]->get_sum() + items[item_index]+ items.back() <= bin_capacity ||
bin_capacity - node.bins[last_index - i]->get_sum() == items[item_index])
{
child_node = node;
child_node.bins[last_index - i]->push_back(items[item_index]);
stack.push(child_node);
}
}
item_index++;
}
else
{
if (node.bins() == BINS)
{
//copy solution
bins = node.bins;
return true;
}
}
}
return false;
}
Any suggestions are highly appreciated.

I think there's a dynamic programming algorithm for solving the multiple-bin packing problem, or at least, a polynomial approximation algorithm. Take a look here and here.
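Note that the recursive version in the question returns the result of the first bin that accepts the item and never takes the item out again, so it does not actually backtrack. A minimal sketch of the usual try/undo pattern, assuming bins are tracked only by their current loads, item sizes are non-negative, and at most `bin_count` bins may be used (`can_pack` and `pack` are illustrative names, and the question's extra pruning test with `items.back()` is left out):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tries to place items[index..] into the bins whose current loads are in `loads`.
bool pack(const std::vector<int>& items, std::vector<int>& loads,
          std::size_t index, int capacity) {
    if (index == items.size()) return true;      // every item has been placed
    for (std::size_t b = 0; b < loads.size(); ++b) {
        if (loads[b] + items[index] > capacity) continue;
        loads[b] += items[index];                // try this bin
        if (pack(items, loads, index + 1, capacity)) return true;
        loads[b] -= items[index];                // undo before trying the next bin
        if (loads[b] == 0) break;                // all empty bins are interchangeable
    }
    return false;
}

// Returns true if the items fit into at most bin_count bins of the given capacity.
bool can_pack(const std::vector<int>& items, std::size_t bin_count, int capacity) {
    assert(capacity >= 0);
    std::vector<int> loads(bin_count, 0);
    return pack(items, loads, 0, capacity);
}
```

This sketch is still recursive, with one stack frame per item, so it addresses the correctness of the search rather than the stack depth. The search is exponential in the worst case.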
