Tree traversal without recursive / stack usage (C#)? - performance

I'm working with WPF and I'm developing a complex usercontrol, which is composed of a tree with rich functionality etc.
For this purpose I used a View-Model design pattern, because some operations couldn't be achieved directly in WPF. So I take the IHierarchyItem (which is a node and pass it to this constructor to create a tree structure)
private IHierarchyItemViewModel(IHierarchyItem hierarchyItem, IHierarchyItemViewModel parent)
{
this.hierarchyItem = hierarchyItem;
this.parent = parent;
List<IHierarchyItemViewModel> l = new List<IHierarchyItemViewModel>();
foreach (IHierarchyItem item in hierarchyItem.Children)
{
l.Add(new IHierarchyItemViewModel(item, this));
}
children = new ReadOnlyCollection<IHierarchyItemViewModel>(l);
}
The problem is that this constructor takes about 3 seconds !! for 200 items on my dual-core.
Am I doing anythig wrong or recursive constructor call is that slow?
Thank you very much!

OK I found a non recursive version by myself, although it uses the Stack.
It traverses the whole tree:
Stack<MyItem> stack = new Stack<MyItem>();
stack.Push(root);
while (stack.Count > 0)
{
MyItem taken = stack.Pop();
foreach (MyItem child in taken.Children)
stack.Push(MyItem);
}

There should be nothing wrong with a recursive implementation of a tree, especially for such a small number of items. A recursive implementation is sometimes less space efficient, and slightly less time efficient, but the code clarity often makes up for it.
It would be useful for you to perform some simple profiling on your constructor. Using one of the suggestions from: http://en.csharp-online.net/Measure_execution_time you could indicate for yourself how long each piece is taking.
There is a possibility that one piece in particular is taking a long time. In any case, that might help you narrow down where you are really spending the time.

Related

Does Java optimize new HashSet(someHashSet).contains() to be O(1)?

This class is an example of where the issue arises:
public class ContainsSet {
private static HashSet<E> myHashSet;
[...]
public static Set<E> getMyHashSet() {
return new HashSet<E>(myHashSet);
}
public static boolean doesMyHashSetContain(E e) {
return myHashSet.contains(e);
}
}
Now imagine two possible functions:
boolean method1() {
return ContainsSet.getMyHashSet().contains(someE);
}
boolean method2() {
return ContainsSet.doesMyHashSetContain(someE);
}
Now is my question whether or not method 1 will have the same time complexity after Java optimization as method 2.
(I used HashSet instead of just Set to emphasize that myHashSet.contains(someE) has complexity O(1).)
Without optimization it would not. Although .contains() has complexity O(1), the new HashSet<E>(myHashSet) has complexity O(n), which would give method 1 a complexity of O(n) + O(1) = O(n), which is horrible compared to the beloved O(1).
The reason why I this issue is imported is because I am teached not to return lists or sets if you will not allow an external class to change the contents of it. Returning a copy is an obvious solution, but it can be really time-consuming.
No, javac does not (and cannot) optimize this away. It's required to emit the byte code you describe in your source to this level. And the JVM will not be nearly intelligent enough to optimize this away. It's way too likely to have side effects to prove.
Don't return a copy of the HashSet if you want immutability. Wrap it in an unmodifiable wrapper: Collections.unmodifiableSet(myHashSet)
What can I say here but that creating a new HashSet and populating it via the constructor is expensive!
Java will not "optimize away" this work: Even though you and I know it would give the same result as "passing through" the contains() call, java can not know this.
No. That is beyond optimization. You returned a new object and you could use it in somewhere else, Java is not supposed to omit it. A new HashSet will be created.
This is not a good practice to return a copy. It's not only time consuming but also space consuming. As Sean said, you can wrap with unmodifiableSet or you can wrap it in your own class.
You can try this:
public static Set<E> getMyHashSet() {
return Collection.unmodifiableSortedSet(myHashSet);
}
Note: use that method will return a view of your set, not a copy.

What are the drawbacks of writing an algorithm in a class or in an object (in Scala)?

class A {
def algorithmImplementation (...) = { ... }
}
object A {
def algorithmImplementation (...) = { ... }
}
In which circumstances should the class be used and in which should the object be used (for implementing an algorithm, e.g. Dijkstra-Algorithm, as shown above) ?
Which criterias should be considered when making such a decision ?
At the moment, I can not really see what the beneftis of using a class are.
If you only have one implementation, this can largely be a judgement call. You mentioned Dijkstra's algorithm, which runs on a graph. Now you can write that algorithm to take a graph object as an explicit parameter. In that case, the algorithm would presumably appear in the Graph singleton object. Then it might be called as something like Graph.shortestPath(myGraph,fromNode,toNode).
Or you can write the algorithm in the Graph class, in which case it no longer takes the graph as an explicit parameter. Now it is called as something like myGraph.shortestPath(fromNode,toNode).
The latter case probably makes more sense when there is one main argument (eg, the graph), especially one that serves as a kind of context for the algorithm. But it may come down to which syntax you prefer.
However, if you have multiple implementations, the balance tips more toward the class approach, especially when the choice of which implementation is better depends on the choice of representation. For example, you might have two different implementations of shortestPath, one that works better on adjacency matrices and one that works better on adjacency lists. With the class approach, you can easily have two different graph classes for the two different representations, and each can have its own implementation of shortestPath. Then, when you call myGraph.shortestPath(fromNode,toNode), you automatically get the right implementation, even if you don't know whether myGraph uses adjacency matrices or adjacency lists. (This is kind of the whole point of OO.)
Classes can have subclasses which override implementation, objects cannot be subclassed.
Classes can also be type parametric where objects cannot be.
There's only ever one instance of the object, or at least one instance per container. A class has multiple instances. That means that the class can be parameterized with values
class A(param1 : int, param2 : int) {
def algorithmImplementation(arg : List[String]) = // use arg and params
}
And that can be reused like
val A42_13 = new A(42, 13)
val result1 = A42_13.algorithmImplementation(List("hello", "world"))
val result2 = A42_13.algorithmImplementation(List("goodbye", "cruel", "world"))
To bring all this home relative to your example of Djikstra's algorithm: imagine you want to write one implementation of the algorithm that is reusable across multiple node types. Then you might want to parameterize by the Node type, the type of metric used to measure distance, and the function used to calculate distance.
val Djikstra[Node, Metric <: Comparable[Metric]](distance : (Node, Node) => Metric) {
def compute(node : Node, nodes : Seq[Node]) : Seq[Metric] = {...}
}
You create one instance of Djikstra per distinct type of node/metric/distance function that you use in your program and reuse that instance without having to pass all that information in everytime you compute Djikstra.
In summary, classes are more flexible. Use them when you need the flexibility. Otherwise objects are fine.

Why is Document.html() so slow?

I was under the impression that the most costly method in Jsoup's API is parse().
But I just discovered that Document.html() could be even slower.
Given that the Document is the output of parse() (i.e. this is after parsing), I find this surprising.
Why is Document.html() so slow?
Answering myself. The Element.html() method is implemented as:
public String html() {
StringBuilder accum = new StringBuilder();
html(accum);
return accum.toString().trim();
}
Using StringBuilder instead of String is already a good thing, and the use of StringBuilder.toString() and String.trim() may not explain the slowness of Document.html(), even for a relatively large document.
But in the middle, our method calls an overloaded version, Element.html(StringBuilder) which loops through all child nodes in the document:
private void html(StringBuilder accum) {
for (Node node : childNodes)
node.outerHtml(accum);
}
Thus if the document contains lots of child nodes, it will be slow.
It would be interesting to see whether there could be a faster implementation of this.
For example, if Jsoup stores a cached version of the raw html that was provided to it via Jsoup.parse(). As an option of course, to maintain backward compatibility and small footprint in memory.

How to represent data to be used for DFS/BFS

I was assigned a problem to solve using various search techniques. The problem is very similar to the Escape From Zurg problem or the Bridge and Torch problem. My issue is that I am lost as to how to represent the data as a tree.
This is my guess as to how to do it, but it doesn't make much sense for searching.
Another way could be to use a binary tree sorted by their walking time. However, I'm still not sure if I'm attacking this problem correctly since search algorithms don't necessarily require binary trees.
Any tips on representing this data would be appreciated.
Generally when you are using a tree search to solve a problem, each node represents some possible "state" of the world (who's on what side of the bridge, for example), and the children of each node represent all possible "successor states" (new states that can be reached in one move from the previous state). A depth-first search then represents trying one option until it dead-ends, then backing up to the last state where another option was available and trying it out. A breadth-first search represents trying out lots of options in parallel and seeing when the first of them find the goal node.
In terms of the actual way of encoding this, you would represent this as a multiway tree. Each node would probably contain the current state, plus a list of pointers to child nodes.
Hope this helps!
U could use something like this:
public class Node
{
public int root;
public List<Node> neighbours;
public Node(int x)
{
root=x;
}
public void setNeighboursList(List<Node> l)
{
neighbours = l;
}
public void addNeighbour(Node n)
{
if(neighbours==null) neighbours = new ArrayList<Node>();
neighbours.add(n);
}
...
}
public class Tree
{
public Node root;
....
}

Binding to a Binary Tree in Knockoutjs

I'm looking for some advice on binding knockoutjs to a binary tree with dependentObservables.
I'm working on a web project that involves a binary tree in javascript. The binary tree implementation has been completed, and I'm running into a problem using it with Knockoutjs.
The binary tree doesn't really have any properties, only a few methods (addNode, inorderTraversal, getLength, getDepth, toJSON, etc), so I have no clue how to set it up as observable. I'd really just love to have a few dependentObservables that get some information from the binary tree.
As a simple example, I'd like to at least set up a dependentObservable for the length of the tree. It just never seems to get fired...
viewModel.TreeLength = ko.dependentObservable(function(){
return this.bTree().getLength();}, viewModel);
The following adds the node to the tree, but the TreeLength never fires.
viewModel.bTree().addNode(new Node('some data'));
RP Niemeyer pointed me to the solution with valueHasMutated. The first round was just adding a call to viewModel.bTree.valueHasMutated() every time we worked with the tree.
Once this was proven to work, the code was refactored to pass a callback method to the tree, so that any time the tree changed, the callback would be invoked. We ran into some problems with closures, but finally got to the following:
function getCallBack(o)
{
var obj = o;
var func = function()
{
obj.bTree.valueHasMutated();
}
return func;
}
this.bTreeChanged = getCallBack(this);
model.bTree = new BinaryTree(model.treeData, this.bTreeChanged);

Resources