How does weight order affect the computing cost of a backtracking algorithm? The number of nodes and the search tree are the same, but when it's non-ordered it takes more time, so the ordering must be doing something.
Thanks!
Sometimes in backtracking algorithms, when you know a certain branch cannot contain an answer, you can trim it. This is very common with game-playing agents and is called Alpha-Beta Pruning.
Thus, when you reorder the visited nodes, you can increase your pruning rate and thereby decrease the actual number of nodes you visit, without affecting the correctness of your answer.
One more possibility, if there is no pruning, is cache performance. Sometimes trees are stored as arrays [especially complete trees]. Arrays are most efficient when iterated sequentially, not when "jumping randomly". The reordering might change this access pattern, resulting in better/worse cache behavior.
The essence of backtracking is precisely not looking at all possibilities (nodes, in this case). However, if the nodes are not ordered, it is impossible for the algorithm to "prune" a branch, because it cannot know with certainty whether the element is actually on that branch.
This is unlike an ordered tree: if the searched element is greater/smaller than the root of a subtree, the searched element must be to the right or left respectively. That is why, if the tree is not ordered, the computational cost equals brute force; if the tree is ordered, the worst case is still equivalent to brute force, but the actual running time is smaller.
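To make the difference concrete, here is a small Python sketch (the tree layout and function names are my own) that counts visited nodes when searching the same 15 values stored as a balanced BST versus searching them with no ordering information:

```python
def build_bst(sorted_vals):
    """Build a balanced BST as nested (value, left, right) tuples."""
    if not sorted_vals:
        return None
    mid = len(sorted_vals) // 2
    return (sorted_vals[mid],
            build_bst(sorted_vals[:mid]),
            build_bst(sorted_vals[mid + 1:]))

def bst_search(node, target):
    """Ordered search: each comparison prunes an entire subtree."""
    visited = 0
    while node is not None:
        visited += 1
        val, left, right = node
        if target == val:
            return True, visited
        node = left if target < val else right
    return False, visited

def blind_search(node, target):
    """No ordering information: must explore until the target is found."""
    visited = 0
    stack = [node]
    while stack:
        n = stack.pop()
        if n is None:
            continue
        visited += 1
        val, left, right = n
        if val == target:
            return True, visited
        stack.extend([left, right])
    return False, visited

tree = build_bst(list(range(15)))
print(bst_search(tree, 1))    # (True, 3): root, then two pruning steps
print(blind_search(tree, 1))  # found too, but after many more visits
```

Both searches visit the same kind of nodes and return the same answer; only the number of nodes actually visited differs, which is where the time goes.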
I recently came across D.S.U. (Disjoint Set Union) and its applications on trees. As I was solving related problems, I got a Time Limit Exceeded error in some, so I read the tutorial again, and there I found that an improved version of the normal union is the weighted union. In this weighted union operation, we make the smaller subset's root a child of the larger subset's root. How does this benefit us?
Link to Tutorial
You should understand the purpose/logic behind weighted union-find.
First, why do we need weighted union-find? Because a simple, unweighted union-find can produce an unbalanced tree, in the worst case a linked list. What's the complexity of traversing a linked list? O(N). That's the worst-case complexity when using a simple union-find.
Our goal is to balance the resulting tree.
How and why does weighted union-find work? It's a simple optimization: keep the size of each subset and, when performing a union of two subsets, make the smaller one a child of the larger one.
Why does this work? Because, as mentioned, our goal when doing the union is to balance the tree, not to unbalance it. If you make the smaller subset a child of the larger subset, the height of the overall tree does not increase (when the sizes are equal, either choice adds one to the height). On the other hand, if you make the bigger subset a child of the smaller one, you know what will happen.
Using just this optimization, we improve the worst-case time complexity from O(N) to O(log2(N)), because the height of the tree will never exceed log2(N).
There's another optimization, path compression, that can be applied alongside this to bring the complexity down even further. Your link probably covers it.
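The size-tracking idea above can be sketched in a few lines of Python (a minimal illustration, without path compression; the class name is my own):

```python
class DSU:
    """Disjoint Set Union with weighted (union-by-size) linking."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n           # size of the subset rooted here

    def find(self, x):
        # walk parent links up to the root of x's subset
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # attach the smaller tree under the root of the larger one
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

d = DSU(4)
d.union(0, 1)
d.union(2, 3)
d.union(0, 2)
print(d.find(3) == d.find(1))  # True: all four are in one set
```

Because every time a node's root changes its subset at least doubles in size, no node can be more than log2(N) links away from its root.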
From a correctness point of view it makes no difference, but it is usually faster.
Check this example:
In the first case, you put the biggest set as a child of the smallest. You can see that, in this case, if you call the find method on the deepest node, it will perform 3 steps. This doesn't happen in the second case.
This is not a strict rule, but in practice it's what happens.
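The same point can be shown numerically (a sketch of my own construction): link four singleton sets both ways and measure the worst-case number of parent hops a find must make.

```python
def max_find_steps(parent):
    """Worst-case number of parent hops over all nodes in the forest."""
    def steps(x):
        s = 0
        while parent[x] != x:
            x = parent[x]
            s += 1
        return s
    return max(steps(i) for i in range(len(parent)))

# In both cases we repeatedly merge singleton {i} with everything built so far.

# Case 1: always make the bigger set's root a child of the new singleton.
bad = [0, 1, 2, 3]
bad[0] = 1   # {0} under 1
bad[1] = 2   # {0,1} under 2
bad[2] = 3   # {0,1,2} under 3: a chain 0 -> 1 -> 2 -> 3

# Case 2 (weighted): always make the singleton a child of the bigger root.
good = [0, 1, 2, 3]
good[1] = 0  # {1} under 0
good[2] = 0  # {2} under 0
good[3] = 0  # {3} under 0: a star rooted at 0

print(max_find_steps(bad))   # 3 steps for the deepest node
print(max_find_steps(good))  # 1 step
```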
I recently had an interview for a position dealing with extremely large distributed systems, and one of the questions I was asked was to write a function that could count the nodes in a binary tree entirely in place, meaning no recursion and no queue or stack for an iterative approach.
I don't think I have ever seen a solution that does not use at least one of the above, either when I was in school or after.
I mentioned that having a "parent" pointer would somewhat trivialize the problem, but adding even a single simple field to each node of a tree with a million nodes is not trivial in terms of memory cost.
How can this be done?
If an exact solution is required, then the prerequisite of being a binary tree may be a red herring: each node in the cluster may simply count allocations in its backing collection, which takes either constant or linear time, depending on whether the count has been tracked.
If no exact solution was asked for, but the given tree is balanced, then a simple deep probe to determine tree height, combined with the placement rules, allows you to estimate an upper and lower bound for the total node count. Be wary that the probe may have hit a node at depth log2(n) or log2(n) - 1, so your estimate can be off by up to a factor of 2 in either direction. Constant space, O(log(n)) time.
If the placement rules dictate special properties of the bottom-most layer (e.g. filled from left to right; not, e.g., a red-black tree), then you may perform log(n) probes in a binary-search pattern to find the exact count, in constant space and O(log(n)^2) time.
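For the left-to-right-filled (complete) case, the probing idea can be sketched as follows (node representation is my own: each node is a (left, right) tuple). At every level, comparing two left-spine probes tells you which subtree is perfect, so you can count it in closed form and recurse into the other, giving O(log n) levels at O(log n) probe cost each:

```python
def count_nodes(root):
    """Count nodes of a complete binary tree in O(log^2 n), no extra storage
    per node beyond the recursion itself."""
    def left_height(node):
        h = 0
        while node is not None:
            h += 1
            node = node[0]  # follow left children
        return h

    if root is None:
        return 0
    lh = left_height(root[0])
    rh = left_height(root[1])
    if lh == rh:
        # left subtree is perfect: 2^lh - 1 nodes, plus the root
        return (1 << lh) + count_nodes(root[1])
    else:
        # right subtree is perfect: 2^rh - 1 nodes, plus the root
        return (1 << rh) + count_nodes(root[0])

def build_complete(n, i=0):
    """Build a complete tree of n nodes via heap indexing (children of i
    are 2i+1 and 2i+2); for testing the counter."""
    if i >= n:
        return None
    return (build_complete(n, 2 * i + 1), build_complete(n, 2 * i + 2))

print(count_nodes(build_complete(100)))  # 100
```

Note that this sketch still uses recursion for clarity; since it only ever recurses into one child, it converts directly into a loop if the "no recursion" constraint is taken literally.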
I am trying to find a reasonable algorithm to find the first tree-pattern match in unordered, rooted trees. According to some research I have come across, this problem is NP-complete. I don't need to find every pattern match; I just need to find any pattern match that exists. Preferably, I would rather not have to perform "deletions" on my tree (nor do I want to make a copy to delete nodes from).
Another thing to note is that the tree will be updated between tree matching queries, so I'm also hoping that there may be some algorithms that take advantage of this fact, possibly using an online approach that keeps track of previous partial matches in the tree to optimize a future match.
Is there a straightforward algorithm that can solve this problem given the criteria I mentioned, but one that is still better than the pure naive brute force approach?
Note: my problem is similar to this previously asked question, but that question is specific to ordered trees.
According to http://www.sciencedirect.com/science/article/pii/S1570866704000644 the problem that is NP-complete is tree inclusion. That means the pattern tree may fit while potentially skipping generations. So, for instance, a tree with one root and 1000 leaves could fit into a tree that splits in two 10 times (giving 2^10 = 1024 leaves). And because this problem is NP-complete, you cannot do fundamentally better than exponential growth as the trees grow.
But you can reduce that exponent and do much better than brute force. For example for each node in the tree record the maximum depth below it and total number of descendants. As you try to fit one tree into the other, stop searching whenever you're trying to fit a subtree with too much depth or too many children. This will let you avoid following a lot of lost causes.
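A minimal sketch of that precomputation (the tree representation, a list of child-index lists, is my own choice): record the maximum depth and descendant count per node, then skip any candidate target whose subtree is too shallow or too small.

```python
def subtree_stats(children, root=0):
    """children[v] is the list of child indices of v.
    Returns (depth, size): max depth below each node and its subtree size."""
    n = len(children)
    depth = [0] * n
    size = [1] * n

    def visit(v):
        for c in children[v]:
            visit(c)
            depth[v] = max(depth[v], depth[c] + 1)
            size[v] += size[c]

    visit(root)
    return depth, size

def could_fit(p, t, pdepth, psize, tdepth, tsize):
    """Necessary (not sufficient) condition for pattern node p
    to be mappable at or below target node t."""
    return pdepth[p] <= tdepth[t] and psize[p] <= tsize[t]
```

Checking `could_fit` before recursing costs O(1) per pair and cuts off whole families of hopeless mappings early.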
You can also use dynamic programming to help. What you try to do is store, for each pair of nodes from the two trees, whether or not the subtree below one can be mapped to the other. When you're looking at whether a can map to b, what you first do is check where the children of a can go among the children of b. If any child can't go anywhere, then you know the answer is no. If all can go somewhere, then sort the children of a from fitting in the fewest places to the most. Now do a brute-force search for how to fit one into the other. You'll tend to find dead ends very quickly with this way of organizing the search.
However, if the trees are large and one won't fit into the other, you can spend a very, very long time figuring that out.
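Here is a sketch of that search (with a simplification of my own: it checks a root-to-root embedding where children map to distinct children, not full tree inclusion with skipped generations). Trees are nested tuples of child subtrees, and memoization plays the role of the pair table:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_embed(a, b):
    """Can pattern tree a be embedded at the root of target tree b,
    mapping a's children injectively onto b's children?"""
    if len(a) > len(b):
        return False
    # for each child of a, the set of b's children it could map to
    options = []
    for ca in a:
        feasible = tuple(j for j, cb in enumerate(b) if can_embed(ca, cb))
        if not feasible:
            return False          # some child of a fits nowhere
        options.append(feasible)
    options.sort(key=len)         # try the most constrained children first

    used = set()
    def assign(i):
        # backtracking search for an injective assignment
        if i == len(options):
            return True
        for j in options[i]:
            if j not in used:
                used.add(j)
                if assign(i + 1):
                    return True
                used.discard(j)
        return False

    return assign(0)

leaf = ()
star3 = (leaf, leaf, leaf)  # root with three leaves
chain2 = ((leaf,),)         # root -> child -> leaf
print(can_embed((leaf, leaf), star3))  # True: two leaves map to two of three
print(can_embed(chain2, star3))        # False: star3 has no depth-2 path
```

Sorting the children by how few places they fit is exactly the dead-end-first ordering described above; the memo table means each pair of subtrees is analyzed at most once.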
In the decrease-key operation of a Fibonacci heap, if a node is allowed to lose s > 1 children before being cut and melded into the root list (promoting the node), does this alter the overall runtime complexity? I think there are no changes in the complexity, since the change in potential will be the same, but I am not sure whether I am right.
And how can this be proved by the amortized analysis?
Changing the number of children that a node in the Fibonacci heap can lose does affect the runtime, but my suspicion is that if you're careful with how you do it you'll still get the same asymptotic runtime.
You're correct that the potential function will be unchanged if you allow each node to lose multiple children before being promoted back up to the root. However, the potential function isn't the source of the Fibonacci heap's efficiency. The reason we perform cascading cuts (promoting multiple nodes back up to the root level during a decrease-key) is to ensure that a tree of order n has a number of nodes that is exponential in n. That way, when doing a dequeue-min operation and coalescing trees so that there is at most one tree of each order, the total number of trees required to store all the nodes is logarithmic in the number of nodes. The standard marking scheme ensures that each tree of order n has at least Θ(φ^n) nodes, where φ is the golden ratio (about 1.618).
If you allow more nodes to be removed out of each tree before promoting them back to the root, my suspicion is that if you cap the number of missing children at some constant that you should still get the same asymptotic time bounds, but probably with a higher constant factor (because each tree holds fewer nodes and therefore more trees will be required). It might be worth writing out the math to see what recurrence relation you get for the number of nodes in each tree in case you want an exact value.
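Writing out that math (a sketch using the standard argument, with the cap of s lost children as the only change): when the i-th child was linked under a node x, x had degree at least i - 1, so that child did too; since then it can have lost at most s children, so its degree is at least i - 1 - s. Letting S(k) be the minimum number of nodes in a tree of order k, this gives

```latex
S(k) \;\ge\; 2 + \sum_{i=s+2}^{k} S(i - s - 1)
\quad\Longrightarrow\quad
S(k) \;\ge\; S(k-1) + S(k-s-1).
```

The growth rate of this recurrence is the positive root of x^{s+1} = x^s + 1, which is strictly greater than 1 for any fixed s (and equals φ for s = 1, recovering the Θ(φ^n) bound). So tree sizes stay exponential in the order and degrees stay O(log n), but with a constant factor that worsens as s grows, matching the suspicion above.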
Hope this helps!
Since tree height is the main impediment to computational efficiency, a good strategy is to make the root of the shorter tree point to the root of the taller tree.
Does this really matter, though? I mean, if you did it the other way around (merged the taller tree into the shorter one), the tree height would only increase by 1. Since an increase of 1 wouldn't make a real difference (would it?), does it really matter which tree is merged into which? Or is there another reason why the shorter tree is merged into the taller?
Note I am talking about disjoint sets.
It isn't really clear which kind of tree you are talking about (binary search trees, disjoint sets, or any n-ary tree).
But in any case, I think the reason is that although an increase of 1 isn't significant on its own, if you do n mergers you can end up with an increase of n. This can be significant in a data structure that needs lots of mergers (e.g. disjoint sets).
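For disjoint sets specifically, the "point the shorter tree at the taller" rule is usually implemented by keeping a rank (a height bound) per root; a minimal sketch (class and method names are mine):

```python
class DisjointSets:
    """Disjoint sets with union by rank (height-based linking)."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n  # upper bound on the height of each root's tree

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # point the shorter tree at the taller one
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        # height grows only when two equally tall trees meet
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
```

Since the rank only increases when two equally tall trees are merged, it can never exceed log2(n); done the other way around, n merges really can stack up a height of n.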
The quotation lacks context. For example, in some tree structures elements may have to be inserted one by one (possibly rebalancing the tree; usually you want trees of height O(log n)). Maybe this is what is meant: it is then cheaper to insert the fewer elements into the larger tree.
Obviously, whether a height increase of 1 matters depends in part on how often the height is increased by one :-)
Edit: with disjoint sets, it is important that the smaller (shorter) tree be added to the bigger one.