Confusing sequence of code - visual-studio-2010

I am attempting to use HTML Agility Pack to parse some webpages.
Here is a line of code I came across in an example:
var div = document.DocumentNode.Descendants().Where(n => n.Name == "div")
The tooltip says "(parameter) HtmlNode n" when I hover over n in Visual Studio.
I am uncertain what n is and what this line does.

This code selects all descendants of the document's root node whose tag name is "div".
document.DocumentNode selects the root node.
.Descendants() selects all the nodes under the root node (not only direct children, but nodes at every depth).
.Where() keeps only the nodes that meet some criterion.
n => n.Name == "div" is the criterion itself, written as a lambda expression: n stands for each node in turn, and the criterion is true when that node's Name is equal to "div".
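If it helps to see the same idea outside of C#, here is a rough Python analogue using the standard library's xml.etree.ElementTree. This is only a sketch, not HTML Agility Pack: the sample markup and the name find_divs are made up for illustration, and ElementTree needs well-formed XML rather than arbitrary HTML.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup (an assumption for this sketch).
SAMPLE = "<html><body><div id='a'><p>x</p><div id='b'/></div><span/></body></html>"


def find_divs(markup):
    root = ET.fromstring(markup)
    # root.iter() walks the root and every node below it, at any depth.
    # (Unlike Descendants(), it also yields the root itself.)
    # The lambda plays the role of n => n.Name == "div": it is called once
    # per node, with n bound to that node, and keeps the node when it returns True.
    return list(filter(lambda n: n.tag == "div", root.iter()))


div_ids = [d.get("id") for d in find_divs(SAMPLE)]
```

So n is not a variable you declare anywhere else; it is just the parameter name of the small function that gets applied to each node.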

Related

Comparing two nodes in XPath

I am writing an XPath query to be used as a rule in PMD.
Now
//Method/ModifierNode[Annotation[@Image = 'Future']]/..[@Image = 'randomMethod']
gives me one node and
//ForEachStatement
//MethodCallExpression
[@MethodName = 'randomMethod']
gives me another.
I want to compare these two and see whether the name of the node in the first query and the name of the node in the second query are the same.
I am doing this:
//ForEachStatement
//MethodCallExpression
[@MethodName = //Method/ModifierNode[Annotation[@Image = 'Future']]/..[@Image]]
This is not working at all and returns zero matched nodes.
You have an issue with the types of the values you are comparing.
@MethodName is a string.
//Method/ModifierNode[Annotation[@Image = 'Future']]/..[@Image] selects a node (the predicate only makes sure it has an Image attribute).
So when comparing both, it will always be false. You want to get the name of the method node in the second selector, so you can compare strings. You can do so with:
//Method/ModifierNode[Annotation[@Image = 'Future']]/../@Image
So your XPath should look like
//ForEachStatement
//MethodCallExpression
[@MethodName = //Method/ModifierNode[Annotation[@Image = 'Future']]/../@Image]

Tree data structure in Julia

I need to implement a simple (but not binary) tree in Julia. Basically, each node needs to have an integer ID, and I need a convenient way to get a list of children for a node + add child to an existing node by ID.
e.g. 0->1->(2->(3,4,5),6)
where each number represents a node, I need the functions children(2) and add(7 as child of 4).
I am aware that similar tree implementations can be found for other languages, but I am pretty new to OOP/classes/data structures and not managing to "translate" them to Julia.
You didn't state whether you want the IDs to be assigned automatically as new nodes are added, or if you want to specify them when you add the children (which would involve some form of more complicated lookup).
If the IDs can be assigned, you could implement a tree structure as follows:
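# Nodes are identified by their index in tree.nodes; the root is node 1 and has parent 0.
# (`struct` replaces the `type` keyword of older Julia versions.)
struct TreeNode
    parent::Int
    children::Vector{Int}
end

struct Tree
    nodes::Vector{TreeNode}
end

# A new tree contains only the root node.
Tree() = Tree([TreeNode(0, Int[])])

# Add a child under node `id` and return the new node's ID.
function addchild(tree::Tree, id::Int)
    1 <= id <= length(tree.nodes) || throw(BoundsError(tree, id))
    push!(tree.nodes, TreeNode(id, Int[]))
    child = length(tree.nodes)
    push!(tree.nodes[id].children, child)
    child
end

children(tree, id) = tree.nodes[id].children
parent(tree, id) = tree.nodes[id].parent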
Otherwise, you might want to use Dict{Int,TreeNode} to store the tree nodes.
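To illustrate the dictionary-based variant, where you choose the IDs yourself, here is a minimal sketch in Python rather than Julia (the same shape should carry over to a Dict{Int,TreeNode}). The class name Tree and the method names add and children are just picked to match the wording of the question; nothing here is a library API.

```python
class Tree:
    """Tree with caller-chosen integer IDs, stored in dictionaries."""

    def __init__(self, root_id=0):
        self.parent = {root_id: None}  # node ID -> parent ID (None for the root)
        self.kids = {root_id: []}      # node ID -> list of child IDs

    def add(self, child_id, parent_id):
        """Attach a new node child_id under the existing node parent_id."""
        if parent_id not in self.kids:
            raise KeyError(f"unknown parent {parent_id}")
        if child_id in self.kids:
            raise ValueError(f"node {child_id} already exists")
        self.parent[child_id] = parent_id
        self.kids[child_id] = []
        self.kids[parent_id].append(child_id)

    def children(self, node_id):
        """Return a copy of the child ID list of node_id."""
        return list(self.kids[node_id])


# Build the example from the question: 0->1->(2->(3,4,5),6)
tree = Tree(0)
for child, parent in [(1, 0), (2, 1), (6, 1), (3, 2), (4, 2), (5, 2)]:
    tree.add(child, parent)
tree.add(7, 4)  # "add 7 as child of 4"
```

The dictionary lookup replaces the index arithmetic of the vector version, at the cost of having to reject duplicate IDs yourself.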

XPath: Select following siblings until certain class

I have the following html snippet:
<table>
<tr>
<td class="foo">a</td>
<td class="bar">1</td>
<td class="bar">2</td>
<td class="foo">b</td>
<td class="bar">3</td>
<td class="bar">4</td>
<td class="bar">5</td>
<td class="foo">c</td>
<td class="bar">6</td>
<td class="bar">7</td>
</tr>
</table>
I'm looking for an XPath 1.0 expression that starts at a .foo element and selects all following .bar elements before the next .foo element.
For example: I start at a and want to select only 1 and 2.
Or I start at b and want to select 3, 4 and 5.
Background: I have to find an XPath expression for this method (using Java and Selenium):
public List<WebElement> bar(WebElement foo) {
    return foo.findElements(By.xpath("./following-sibling::td[@class='bar']..."));
}
Is there a way to solve the problem?
The expression should work for all .foo elements without using any external variables.
Thanks for your help!
Update: There is apparently no solution for these special circumstances. But if you have fewer limitations, the provided expressions work perfectly.
Good question!
The following expression will give you 1..2, 3..5 or 6..7, depending on the index X + 1, where X is the set you want (2 gives 1-2, 3 gives 3-5, etc.). In the example, I select the third set, hence it has [4]:
/table/tr[1]
    /td[not(@class = 'foo')]
    [
        generate-id(../td[@class='foo'][4])
        = generate-id(
            preceding-sibling::td[@class='foo'][1]
                /following-sibling::td[@class='foo'][1])
    ]
The beauty of this expression (imnsho) is that you can index by the given set (as opposed to indexing by relative position) and that it has only one place where you need to update the expression. If you want the sixth set, just type [7].
This expression works for any situation where you need the siblings between any two nodes that meet the same requirement (@class = 'foo'). I'll update with an explanation.
Replace the [4] in the expression with whatever set you need, plus 1. In oXygen, the above expression shows me the following selection:
Explanation
/table/tr[1]
Selects the first tr.
/td[not(@class = 'foo')]
Selects any td that is not a foo.
generate-id(../td[@class='foo'][4])
Gets the identity of the xth foo. In this case, it selects nothing and returns an empty string. In all other cases, it will return the identity of the next foo that we are interested in.
generate-id(
    preceding-sibling::td[@class='foo'][1]
        /following-sibling::td[@class='foo'][1])
Gets the identity of the first previous foo (counting backward from any non-foo element) and from there, the first following foo. In the case of node 7, this returns the identity of nothingness, resulting in true for our example case of [4]. In the case of node 3, this will result in c, which is not equal to nothingness, resulting in false.
If the example had the value [2], this last bit would return node b for nodes 1 and 2, which is equal to the identity of ../td[@class='foo'][2], returning true. For nodes 4 and 7 etc., this will return false.
Update, alternative #1
We can replace the generate-id function with a count of preceding siblings. Since the count of the siblings before the two foo nodes is different for each, this works as an alternative for generate-id.
By now it starts to grow just as unwieldy as GSerg's answer, though:
/table/tr[1]
    /td[not(@class = 'foo')]
    [
        count(../td[@class='foo'][4]/preceding-sibling::*)
        = count(
            preceding-sibling::td[@class='foo'][1]
                /following-sibling::td[@class='foo'][1]/preceding-sibling::*)
    ]
The same "indexing" method applies. Where I write [4] above, replace it with the nth + 1 of the intersection position you are interested in.
If the current node is one of the td[@class='foo'] elements, you can use the XPath below to get the following td[@class='bar'] elements that precede the next foo td:
following-sibling::td[@class='bar'][generate-id(preceding-sibling::td[@class='foo'][1]) = generate-id(current())]
Here, you select only those td[@class='bar'] whose first preceding td[@class='foo'] is the same as the current node you are iterating on (confirmed using generate-id()).
So you want an intersection of two sets:
following-sibling::td[@class='bar'] that follow your starting td[@class='foo'] node
preceding-sibling::td[@class='bar'] that precede the next td[@class='foo'] node
Given the formula from the linked question, it is not difficult to get:
//td[1]/following-sibling::td[@class='bar'][count(. | (//td[1]/following-sibling::td[@class='foo'])[1]/preceding-sibling::td[@class='bar']) = count((//td[1]/following-sibling::td[@class='foo'])[1]/preceding-sibling::td[@class='bar'])]
However, this will return an empty set for the last foo node because there is no next foo node to take preceding siblings from.
So you want a difference of two sets:
following-sibling::td[@class='bar'] that follow your starting td[@class='foo'] node
following-sibling::td[@class='bar'] that follow the next td[@class='foo'] node
Given the formula from the linked question, it is not difficult to get:
//td[1]/following-sibling::td[@class='bar'][
    count(. | (//td[1]/following-sibling::td[@class='foo'])[1]/following-sibling::td[@class='bar'])
    !=
    count((//td[1]/following-sibling::td[@class='foo'])[1]/following-sibling::td[@class='bar'])
]
The only amendable bit is the starting point, //td[1] (three times).
Now this will properly return bar nodes even for the last foo node.
The above was written under impression that you need to have a single XPath query and nothing more. Now that it's clear you don't, you can easily solve your problem with more than one XPath query and some manual list filtering on referential equality, as I already mentioned in a comment.
In C# that would be:
// Needs: using System.Collections.Generic; using System.Linq; using System.Xml;
XmlNode context = xmlDocument.SelectSingleNode("//td[8]");
XmlNode nextFoo = context.SelectSingleNode("(./following-sibling::td[@class='foo'])[1]");
IEnumerable<XmlNode> result = context.SelectNodes("./following-sibling::td[@class='bar']").Cast<XmlNode>();
if (nextFoo != null)
{
    // Intersect filters using referential equality by default
    result = result.Intersect(nextFoo.SelectNodes("./preceding-sibling::td[@class='bar']").Cast<XmlNode>());
}
I'm sure it's trivial to convert to Java.
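For comparison, the "filter the list by hand" idea can also be sketched without any XPath axes at all: walk the siblings after the starting foo and stop at the next foo. The sketch below uses Python's standard xml.etree.ElementTree on the table from the question; the function name bars_after is made up, and a Selenium version would have to fetch the sibling list through its own API instead.

```python
import xml.etree.ElementTree as ET

# The table from the question, as well-formed XML.
HTML = """<table><tr>
<td class="foo">a</td><td class="bar">1</td><td class="bar">2</td>
<td class="foo">b</td><td class="bar">3</td><td class="bar">4</td><td class="bar">5</td>
<td class="foo">c</td><td class="bar">6</td><td class="bar">7</td>
</tr></table>"""


def bars_after(row, foo):
    """Return the bar cells that follow `foo` in `row`, up to the next foo cell."""
    cells = list(row)                # element children of the row, in document order
    start = cells.index(foo) + 1     # elements compare by identity here
    result = []
    for cell in cells[start:]:
        if cell.get("class") == "foo":
            break                    # reached the next foo: stop collecting
        if cell.get("class") == "bar":
            result.append(cell)
    return result


row = ET.fromstring(HTML).find("tr")
groups = {
    foo.text: [cell.text for cell in bars_after(row, foo)]
    for foo in row.findall("td[@class='foo']")
}
```

Because the loop simply runs off the end of the row, the last foo needs no special case.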
Pretty straightforward (example for 'a' td) but not very optimal:
//td[
    @class='bar' and
    preceding-sibling::td[@class='foo'][1][text() = 'a'] and
    (
        not(following-sibling::td[@class='foo']) or
        following-sibling::td[@class='foo'][1][preceding-sibling::td[@class='foo'][1][text() = 'a']]
    )
]

Pseudo code to check if binary tree is a binary search tree - not sure about the recursion

I have homework to write pseudo code that checks whether a valid binary tree is a binary search tree.
I created an array to hold the in-order values of the tree. If the in-order values are in decreasing order, it means it is indeed a BST. However, I have a problem with the recursion in the method InOrderArr.
I need to update the index of the array in order to store the values in the array in the order they appear in the tree.
I'm not sure the index is really updated properly during the recursion. Is it or not? If you see a problem, can you help me fix it? Thanks a lot.
pseudo code
first function
IsBST(node)
    size ← TreeSize(node)
    create new array TreeArr with size cells
    index ← 0
    (a few comments: here we use the IN_ORDER procedure with a small variation; I called the new version of the procedure InOrderArr. The pseudo code of InOrderArr is described below IsBST.)
    InOrderArr(node, TreeArr, index)
    for i from 1 to size-1 do
        if not (TreeArr[i] > TreeArr[i-1]) return false
    return true
second function
InOrderArr (node, Array, index)
    if node = NULL then return
    else
        InOrderArr (node.left, Array, index)
        treeArr[index] = node.key
        index ← index + 1
        InOrderArr (node.right, Array, index)
    return
Your code is generally correct. Just three notes.
The correctness of the code depends on the implementation, specifically on the way of index handling. Many programming languages pass arguments to subroutines by value. That means the subroutine receives a copy of the value and modifications made to the parameter have no effect on the original value. So incrementing index during execution of InOrderArr (node.left, Array, index) would not affect the position used by treeArr[index] = node.key. As a result only the rightmost path would be stored in the array.
To avoid that you'll have to ensure that index is passed by reference, so that incrementation done by a callee advances the position used later by a caller.
A BST is usually defined so that the left subtree of a node contains keys that are less than that node's key, and the right subtree contains keys that are greater – see Wikipedia's article on BST. Then the in-order traversal retrieves keys in ascending order. Why do you expect descending order?
Possibly it would be more efficient to drop the array and just recursively test a definition condition of BST?
Whenever we follow a left link we expect keys which are less than the current one. Whenever we follow a right link we expect keys greater than the current one. So for most subtrees there is some interval of valid key values, defined by some ancestor nodes' keys. Just track those keys and test whether each key falls inside the current valid interval. Be sure to handle the 'no left end defined' condition on the leftmost path and 'no right end' on the rightmost path of the tree. At the root node there's no ancestor yet, so the root key is not tested at all (any value is OK).
EDIT
C code draft:
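#include <stdbool.h>
#include <stddef.h>

// Assumed node and tree layouts; the draft itself does not define them.
typedef struct NODE { int key; struct NODE *left, *right; } NODE;
typedef struct TREE { NODE *root; } TREE;

// Test a node against its closest left-side and right-side ancestors
bool isNodeBST(const NODE *lt, const NODE *node, const NODE *rt)
{
    if (node == NULL)
        return true;
    if (lt != NULL && node->key < lt->key)
        return false;
    if (rt != NULL && node->key > rt->key)
        return false;
    return
        isNodeBST(lt, node->left, node) &&
        isNodeBST(node, node->right, rt);
}

bool isTreeBST(const TREE *tree)
{
    return isNodeBST(NULL, tree->root, NULL);
}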
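If you would rather keep the array approach from the question, the index problem from the first note can be avoided without pass-by-reference by letting the traversal return the next free index. Below is a minimal sketch in Python (where reassigning an integer parameter does not affect the caller either). The Node class and the snake_case function names are assumptions made for this illustration, and it tests for strictly ascending order, as the comparison in the question's loop does.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_size(node):
    """Number of nodes in the subtree rooted at node."""
    if node is None:
        return 0
    return 1 + tree_size(node.left) + tree_size(node.right)


def in_order_arr(node, arr, index):
    """Write the subtree's keys in order starting at arr[index].

    Returns the next free index, so the caller continues where the callee stopped.
    """
    if node is None:
        return index
    index = in_order_arr(node.left, arr, index)  # left subtree advances the index
    arr[index] = node.key
    return in_order_arr(node.right, arr, index + 1)


def is_bst(root):
    arr = [None] * tree_size(root)
    in_order_arr(root, arr, 0)
    # A BST yields strictly ascending keys in an in-order traversal.
    return all(arr[i - 1] < arr[i] for i in range(1, len(arr)))


valid = Node(4, Node(2, Node(1), Node(3)), Node(6))
invalid = Node(4, Node(2, Node(1), Node(5)), Node(6))  # 5 sits in the left subtree of 4
```

The returned index does the job that the reference parameter would do in a language that supports one.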

xpath: decipher this xpath?

What does this XPath mean? Can someone decipher it?
//h1[following-sibling::*[1][self::b]]
Select every h1 element (in the document of the context node) that is immediately followed by a b element (with no other intervening element, though there may be intervening text).
Breaking it down:
//h1
Select every h1 element that is a descendant of the root node of the document that contains the context node;
[...]
filter out any of these h1 elements that don't meet the following criteria:
[following-sibling::*[1]...]
such that the first following sibling element passes this test:
[self::b]
self is a b element. Literally, this last test means, "such that when I start from the context node and select the self (i.e. the context node) subject to the node test that filters out everything except elements named b, the result is a non-empty node set."
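To make the "immediately followed by a b element" reading concrete, here is a small Python sketch that performs the same check by hand. It deliberately avoids XPath, since the standard library's xml.etree.ElementTree does not support the following-sibling axis; the sample document and the function name h1_followed_by_b are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the first h1 is followed by text and then a b element,
# the second by a p element, and the third by nothing.
SAMPLE = "<doc><h1>A</h1>text<b>x</b><h1>B</h1><p/><b>y</b><h1>C</h1></doc>"


def h1_followed_by_b(root):
    """Return every h1 whose first following sibling element is a b."""
    matches = []
    for parent in root.iter():
        kids = list(parent)  # element children only; text between them is ignored
        for cur, nxt in zip(kids, kids[1:]):
            if cur.tag == "h1" and nxt.tag == "b":
                matches.append(cur)
    return matches


matched = [h1.text for h1 in h1_followed_by_b(ET.fromstring(SAMPLE))]
```

Only the first h1 qualifies: the text between it and the b does not count as an intervening element, while the p after the second h1 does.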
