Can anyone please help me? I want to use the or operator in my XPath expression to select all input elements or all a elements from an HTML page.
My expression looks like this:
document.DocumentNode.SelectNodes("//input or //a");
But I'm having errors.
You can use the union operator:
//input | //a
Or an expression like this, which may perform somewhat better:
//*[self::input or self::a]
The or operator is boolean OR in XPath, so //input or //a is a boolean expression. It returns true if either of the node sets //input and //a is non-empty (i.e. the source document contains at least one input element, at least one a element, or both) and false otherwise.
Instead you're looking for the | operator which is the "union" operation on node sets.
//input | //a
will give you a set containing all the input elements and all the a elements.
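As a rough illustration of the union operator, here is a minimal sketch in Ruby using REXML rather than the HtmlAgilityPack call from the question. The sample markup and variable names are made up for the example, and the markup has to be well-formed XML for REXML to parse it.

```ruby
require 'rexml/document'

# Hypothetical sample markup (not from the original post).
xml = '<html><body><input name="q"/><a href="/x">x</a><p>text</p><a href="/y">y</a></body></html>'
doc = REXML::Document.new(xml)

# The union operator yields one node-set holding every input and every a element.
nodes = REXML::XPath.match(doc, '//input | //a')

# Sort the element names so the check does not depend on result ordering.
names = nodes.map(&:name).sort
puts names.inspect
```

The p element is not part of the result, because neither side of the union selects it.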
Take this example Ruby expression:
case
when 3 then "foo"
when 4 then "bar"
end
I was surprised to learn that this is not a syntax error. Instead, it evaluates to "foo"!
Why? What are the syntax and evaluation rules being applied here?
In this form of the case expression, the then clause associated with the lexically first when clause that evaluates to a truthy value is evaluated.
See clause b) 2) of §11.5.2.2.4 Semantics of the ISO Ruby Language Specification (bold emphasis mine):
Semantics
A case-expression is evaluated as follows:
a) […]
b) The meaning of the phrase “O is matching” in Step c) is defined as follows:
[…]
If the case-expression is a case-expression-without-expression, O is matching if and only if O is a trueish object.
c) Take the following steps:
Search the when-clauses in the order they appear in the program text for a matching when-clause as follows:
i) If the operator-expression-list of the when-argument is present:
I) For each of its operator-expressions, evaluate it and test if the resulting value is matching.
II) If a matching value is found, other operator-expressions, if any, are not evaluated.
ii) If no matching value is found, and the splatting-argument (see 11.3.2) is present:
I) Construct a list of values from it as described in 11.3.2. For each element of the resulting list, in the same order in the list, test if it is matching.
II) If a matching value is found, other values, if any, are not evaluated.
iii) A when-clause is considered to be matching if and only if a matching value is found in its when-argument. Later when-clauses, if any, are not tested in this case.
If one of the when-clauses is matching, evaluate the compound-statement of the then-clause of this when-clause. The value of the case-expression is the resulting value.
If none of the when-clauses is matching, and if there is an else-clause, then evaluate the compound-statement of the else-clause. The value of the case-expression is the resulting value.
Otherwise, the value of the case-expression is nil.
The RDoc documentation, while much less precise, also states that truthiness is the selection criterion when the condition is omitted, and that lexical ordering determines the order in which when clauses are checked (bold emphasis mine):
case
The case statement operator. Case statements consist of an optional condition, which is in the position of an argument to case, and zero or more when clauses. The first when clause to match the condition (or to evaluate to Boolean truth, if the condition is null) "wins", and its code stanza is executed. The value of the case statement is the value of the successful when clause, or nil if there is no such clause.
It is by design that a case statement without a value to match against behaves like an if statement.
It is actually the same as writing:
if 3
'foo'
elsif 4
'bar'
end
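To make the truthiness rule concrete, here is a small sketch. The method name first_truthy and its arguments are hypothetical and only serve the illustration.

```ruby
# A case without a subject selects the first when clause whose value is truthy.
# Only nil and false are falsy in Ruby, so 0 and "" count as truthy.
def first_truthy(a, b)
  case
  when a then "first"
  when b then "second"
  else "none"
  end
end

puts first_truthy(3, 4)        # both truthy, the lexically first clause wins
puts first_truthy(nil, 4)      # first clause is falsy, second is checked
puts first_truthy(false, nil)  # nothing matches, the else clause is used
puts first_truthy(0, nil)      # 0 is truthy in Ruby
```

Without the else clause, the third call would evaluate to nil, as the specification text quoted above describes.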
I have a short question. How can I display only the elements whose value is '.'?
I have no idea how to do that. I'm a newbie in XPath.
<SalesTransaction>
<TransactionHeader>
<TransactionHeaderFields>
<WrntyID>a</WrntyID>
<ExternalID/>
<Type>.</Type>
<Status>
Submited
</Status>
<CreationDate>
2015-01-12
</CreationDate>
<Date>
2015-01-12T11:41:29Z
</Date>
<DeliveryDate>
2015-01-12
</DeliveryDate>
<Remark/>
</TransactionHeaderFields>
<CatalogFields>
<CatalogID>
saf
</CatalogID>
</CatalogFields>
</TransactionHeader>
</SalesTransaction>
Ignoring the structure and just looking for any element whose text() is equal to ".", you could use:
//*[text()='.']
//* will search through the entire tree structure, looking for any element at any level
[text()='.'] is a predicate filter (kind of like a WHERE clause in SQL) that performs a test on each of those matched elements. Only the ones that have a text() node whose value is equal to . will evaluate to true() and remain in the result.
It's not the most efficient XPath expression, but it may be good enough for what you need.
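A minimal sketch of that expression in Ruby with REXML, assuming a trimmed-down version of the document from the question (the original post does not say which XPath processor is in use):

```ruby
require 'rexml/document'

# Shortened copy of the question's document; only a few fields are kept.
xml = <<~XML
  <SalesTransaction>
    <TransactionHeaderFields>
      <WrntyID>a</WrntyID>
      <ExternalID/>
      <Type>.</Type>
      <Remark/>
    </TransactionHeaderFields>
  </SalesTransaction>
XML
doc = REXML::Document.new(xml)

# Select every element, at any depth, that has a text node equal to ".".
matches = REXML::XPath.match(doc, "//*[text()='.']")
matched_names = matches.map(&:name)
puts matched_names.inspect
```

The container elements are not selected: their own text nodes hold only whitespace, which is not equal to ".".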
What I need doesn't quite seem to match what other articles of a similar title are about.
I need, using XPath 1.0, to be able to get node a, or node b, exclusively, in that order.
That is, node a if it exists, otherwise, node b.
An XPath expression such as:
expression | expression
will get me both in the case they both exist. That is not what I want.
I could go:
(expression | expression)[last()]
Which does in fact get me what I need (in my case), but seems to be a bit inefficient, because it will evaluate both sides of the expression before the last result is selected.
I was hoping for an expression that stops evaluating once the left side succeeds.
A more concrete example of XML
<one>
<two>
<three>hello</three>
<four>bye</four>
</two>
<blahfive>again</blahfive>
</one>
and the XPath that works (but is inefficient):
(/one/*[starts-with(local-name(.), 'blah')] | .)[last()]
To be clear, I would like to grab the immediate child node of 'one' which starts with 'blah'. However, if it doesn't exist, I would like only the current node.
If the 'blah' node does exist, I do not want the current node.
Is there a more efficient way to achieve this?
I need, using XPath 1.0, to be able to get node a, or node b, exclusively, in that order. That is, node a if it exists, otherwise, node b.
An XPath expression such as:
expression | expression
will get me both in the case they both exist. That is not what I want.
I could go:
(expression | expression)[last()]
Which does in fact get me what I need (in my case),
This statement is not true.
Here is an example. Let us have this XML document:
<one>
<a/>
<b/>
</one>
Expression1 is:
/*/a
Expression2 is:
/*/b
Your composite expression:
(Expression1 | Expression2)[last()]
when we substitute the two expressions above is:
(/*/a | /*/b)[last()]
And this expression actually selects b -- not a -- because b is the last of the two in document order.
Now, here is an expression that selects just a if it exists, and selects b only if a doesn't exist -- regardless of document order:
/*/a | /*/b[not(/*/a)]
When this expression is evaluated on the XML document above, it selects a, regardless of its document order -- try swapping in the XML document above the places of a and b to confirm that in both cases the element that is selected is a.
To summarize, one expression that selects the wanted node regardless of any document order is:
Expression1 | Expression2[not(Expression1)]
Let us apply this general expression in your case:
Expression1 is:
/one/*[starts-with(local-name(.), 'blah')]
Expression2 is:
self::node()
The wanted expression (after substituting Expression1 and Expression2 in the above general expression) is:
/one/*[starts-with(local-name(.), 'blah')]
|
self::node()[not(/one/*[starts-with(local-name(.), 'blah')])]
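If the XPath is evaluated from a host language, the "a if it exists, otherwise b" choice can also be made outside the expression, which skips the second lookup when the first one succeeds. The sketch below assumes Ruby with REXML, which the question does not mention; first_or_fallback is a hypothetical helper, and it returns a single node rather than a node-set.

```ruby
require 'rexml/document'

# Returns the first node matched by `primary`. Only when there is no such
# node is `fallback` evaluated, because || short-circuits.
def first_or_fallback(doc, primary, fallback)
  REXML::XPath.first(doc, primary) || REXML::XPath.first(doc, fallback)
end

# b comes before a in document order here, yet a is still preferred.
both   = REXML::Document.new('<one><b/><a/></one>')
only_b = REXML::Document.new('<one><b/></one>')

chosen_both   = first_or_fallback(both, '/*/a', '/*/b').name
chosen_only_b = first_or_fallback(only_b, '/*/a', '/*/b').name
puts chosen_both
puts chosen_only_b
```

This does not replace the pure XPath expression above when only a single expression can be supplied, for example inside an XSLT stylesheet.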
What's exactly the difference between ['#'] and [.='#']? Is there any difference at all?
In e.g. the following expressions:
<xsl:template match="a/@href[.='#']">...</xsl:template>
<xsl:template match="a/@href['#']">...</xsl:template>
A predicate filters out a node when the contained expression does not evaluate to true. [.='#'] tests whether the string value of the context node (.) equals #, so the first template would match all @href attributes for links like ....
The predicate in the second template does not contain a boolean expression, and it is not numeric either (a numeric predicate would be a positional test). It is therefore converted as defined by the boolean function:
Function: boolean boolean(object)
The boolean function converts its argument to a boolean as follows:
a number is true if and only if it is neither positive or negative zero nor NaN
a node-set is true if and only if it is non-empty
a string is true if and only if its length is non-zero
an object of a type other than the four basic types is converted to a boolean in a way that is dependent on that type
Here, we have a non-empty string with effective boolean value true, thus the predicate in your second template will never filter anything.
A predicate like the one in //a[@href], on the other hand, would filter for all links that have an href attribute (here, the predicate is a node-set).
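The conversion rules quoted above can be modeled in a few lines of Ruby. This is only an illustrative sketch: xpath_boolean is a hypothetical helper, not part of any XPath library, and a Ruby Array stands in for a node-set.

```ruby
# Illustrative model of the XPath 1.0 boolean() conversion rules.
def xpath_boolean(value)
  case value
  when true, false
    value
  when Numeric
    # False for positive zero, negative zero and NaN; true for any other number.
    !(value.zero? || (value.respond_to?(:nan?) && value.nan?))
  when Array
    # Stand-in for a node-set: true when it is non-empty.
    !value.empty?
  when String
    # True when the string has non-zero length.
    !value.empty?
  else
    raise ArgumentError, "unsupported type: #{value.class}"
  end
end

puts xpath_boolean('#')  # the string literal used in the second template
puts xpath_boolean('')
puts xpath_boolean([])
```

Under this model the literal '#' always converts to true, which is why the predicate ['#'] never removes anything.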
Let's say I have two expressions: //div[@class="foo"] and //span[@class="foo"]. Is it possible to "combine" them, like so:
//(div | span)[@class="foo"]
Or can I only take the union of the two complete expressions?
//div[@class="foo"] | //span[@class="foo"]
A more idiomatic (and dare I say readable) way to get all of the div and span elements having class="foo" is this:
//*[(self::div or self::span) and @class="foo"]
In English:
Select all elements that are themselves a div or a span and that have a class attribute whose value is 'foo'
As for your original question, the following expressions return equivalent results:
(//div | //span)[@class="foo"]
//div[@class="foo"] | //span[@class="foo"]
The first gives you the set that is the union of all the div and span elements in the document, further filtered to include only those having class="foo" while the latter gives you the union of 1) the set of all div elements having class="foo" and 2) the set of all span elements having class="foo".
It should be fairly obvious that those two sets contain the same thing.
This construct works:
(//golfer | //batter)[@ID="2" or @ID="3"]
...much to my astonishment.