PMD XPath to check for the presence of a RelationalExpression

I wrote the XPath below, which I want to extend to include "<", ">=", etc.
//RelationalExpression[@Image=">" and
descendant::PrimaryExpression/PrimaryPrefix/Name[@Image!="ABC" and @Image!="PQR"]]
The rule below does not give the correct result:
//RelationalExpression[@Image=">" or @Image="<" and
descendant::PrimaryExpression/PrimaryPrefix/Name[@Image!="ABC" and @Image!="PQR"]]
Is there a way to correct the check?
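A likely explanation, assuming the attribute tests are meant to be @Image: in XPath, and binds more tightly than or, so the second rule is read as "the image is > , or (the image is < and the Name check holds)". Every ">" expression then matches regardless of the Name check. Grouping the operator tests in parentheses should restore the intended meaning (not verified against PMD itself):
//RelationalExpression[(@Image=">" or @Image="<" or @Image=">=") and
descendant::PrimaryExpression/PrimaryPrefix/Name[@Image!="ABC" and @Image!="PQR"]]
Python's and/or follow the same relative precedence, so the effect can be sketched there. The function and parameter names are hypothetical stand-ins for the three XPath tests.

```python
def rule_as_written(is_gt, is_lt, name_allowed):
    # Mirrors: @Image=">" or @Image="<" and <Name check>
    # "and" binds first, so this is is_gt or (is_lt and name_allowed).
    return is_gt or is_lt and name_allowed


def rule_with_parentheses(is_gt, is_lt, name_allowed):
    # Mirrors: (@Image=">" or @Image="<") and <Name check>
    return (is_gt or is_lt) and name_allowed


# A ">" comparison whose operand fails the Name check.
case = dict(is_gt=True, is_lt=False, name_allowed=False)
print(rule_as_written(**case))        # True  (unwanted match)
print(rule_with_parentheses(**case))  # False (intended result)
```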

Related

Java Grammar To AST

In java grammar I have a parser rule,
name
: Identifier ('.' Identifier)* ';'
;
How to get all the identifiers under a single AST tree node?
This does not seem possible with your lexer and parser alone.
For this you will need a so-called tree walker. This third stage of the parsing process lets you traverse the generated AST and, with a counter, print the number of occurrences.
Here is a reference in case you decide to implement it:
https://theantlrguy.atlassian.net/wiki/display/ANTLR3/Tree+construction
I hope this helps!

Logical expression evaluation in Xpath

I have an XPATH expression of the following sort that's expected to return a boolean value:
xs:boolean(expression1 or expression2 or expression3)
If expression1 returns true, would the other expressions be evaluated?
In any case, could anyone point me to examples of how complex logical expressions are written efficiently in XPATH?
BTW: I am running the XPATH on MarkLogic.
In XPath 1.0 it's defined that the expressions are evaluated in order, left to right, until one of them returns true.
But the presence of xs:boolean (which is redundant) in your expression suggests you are using XPath 2.0, and XPath 2.0 processors are allowed to evaluate the subexpressions in any order. This is to allow database-style optimization: one of the subexpressions might be much faster to execute (or more likely to return true) than the others, perhaps because of database indexes, so an optimizer will evaluate that one first. But any decent optimizer will stop evaluation after the first expression that evaluates to "true".
I can't tell you specifically what MarkLogic does.
For anyone else trying this, the "or" operator in XPath must be lower-case.
In light of Michael Kay's comments on optimization, I can't say for sure whether MarkLogic chooses which expression to evaluate first or goes left to right, but you can see how a particular XPath is evaluated. In Query Console (usually localhost:8000/qconsole), type in an expression, click the Profile tab, and Run.
//foo[xs:boolean(1 = 1 or 2 = 3)]
The profile tab shows that "1 = 1" is evaluated and "2 = 3" is not.
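The left-to-right behaviour described for XPath 1.0 can be pictured with Python's or, which also stops at the first true operand. This is only an analogy, not MarkLogic's evaluator, and the expr helper is a hypothetical stand-in that records which subexpressions were actually evaluated.

```python
evaluated = []


def expr(name, value):
    # Record that this subexpression was evaluated, then return its value.
    evaluated.append(name)
    return value


result = (expr("expression1", True)
          or expr("expression2", False)
          or expr("expression3", False))

print(result)     # True
print(evaluated)  # ['expression1']
```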

How to have nested conditions for PMD Xpath rules

My rule should apply only to methods without 'get' as part of their name. In other words, my rule needs to apply only to non-getter methods in the class. I know that to get hold of all the non-getter methods, I can use
//MethodDeclarator[not(contains(@Image,'get'))]
However, I don't know the syntax for where to insert the logic for my rules. Is it like
//MethodDeclarator[
not(contains(@Image,'get'))
'Some Rule Statements'
]
I saw the use of . at the beginning of a statement inside [] in some example code. What is it used for?
In my particular case, I need to combine the following pieces, but so far I have been unable to do so.
Piece 1:
//PrimaryExpression[not(PrimarySuffix/Arguments)]
Piece 2:
//MethodDeclarator[not(contains(@Image,'get'))]
Piece 3:
//PrimaryExpression[PrimaryPrefix/@Label='this']
You need to have at least some basic knowledge/understanding of XPath.
"I saw the use of . at the beginning of a statement inside [] in some example code. What is it used for?"
[] is called a predicate. It must contain a boolean expression, and it must immediately follow a node test. It specifies an additional condition that a node satisfying the node test must meet in order to be selected.
For example:
/*/num
selects all elements named num that are children of the top element of the XML document.
However, if we want to select only such num elements, whose value is an odd integer, we add this additional condition inside a predicate:
/*/num[. mod 2 = 1]
Now this last expression selects all elements named num that are children of the top element of the XML document and whose string value represents an odd integer.
. denotes the context node -- this is the node that has been selected so far (or the starting node from which the complete XPath expression is evaluated).
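To make the role of the predicate and of . concrete, here is a minimal sketch using Python's standard library. The sample document is hypothetical, and because ElementTree's XPath subset has no mod operator, the predicate is applied in plain Python, once per candidate node, with that node playing the part of the context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the question.
doc = ET.fromstring(
    "<nums><num>1</num><num>2</num><num>3</num><num>4</num><num>5</num></nums>"
)

# /*/num : every num child of the top element.
all_nums = doc.findall("num")

# /*/num[. mod 2 = 1] : the predicate is tested for each num in turn,
# and "." is the num currently being tested.
odd_nums = [n for n in all_nums if int(n.text) % 2 == 1]

print([n.text for n in odd_nums])  # ['1', '3', '5']
```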
In my particular case, I need to combine following pieces together ...
You forgot to say in what way / how the three expressions should be combined. In XPath some of the frequently used "combinators" are the operators and, or, and the function not().
For example, if you want to select elements that are selected by all three provided XPath expressions, you can use the and operator:
//PrimaryExpression
[not(PrimarySuffix/Arguments)
and
PrimaryPrefix/@Label='this'
]

XPath to find all following siblings up until the next sibling of a particular type

Given this XML/HTML:
<dl>
<dt>Label1</dt><dd>Value1</dd>
<dt>Label2</dt><dd>Value2</dd>
<dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>
<dt>Label4</dt><dd>Value4</dd>
</dl>
I want to find all <dt> and then, for each, find the following <dd> up until the next <dt>.
Using Ruby's Nokogiri I am able to accomplish this like so:
dl.xpath('dt').each do |dt|
ct = dt.xpath('count(following-sibling::dt)')
dds = dt.xpath("following-sibling::dd[count(following-sibling::dt)=#{ct}]")
puts "#{dt.text}: #{dds.map(&:text).join(', ')}"
end
#=> Label1: Value1
#=> Label2: Value2
#=> Label3: Value3a, Value3b
#=> Label4: Value4
However, as you can see I'm creating a variable in Ruby and then composing an XPath using it. How can I write a single XPath expression that does the equivalent?
I guessed at:
following-sibling::dd[count(following-sibling::dt)=count(self/following-sibling::dt)]
but apparently I don't understand what self means there.
This question is similar to XPath : select all following siblings until another sibling except there is no unique identifier for the 'stop' node.
This question is almost the same as xpath to find all following sibling adjacent nodes up til another type except that I'm asking for an XPath-only solution.
This is an interesting question. Most of the problems were already mentioned in @lwburk's answer and in its comments. To unpack the complexity hidden in this question for other readers, my answer is probably more elaborate than the OP needed.
Features of XPath 1.0 related to this problem
In XPath, each step, and each node in the set of selected nodes, works independently. This means that:
a subexpression has no generic way to access data that was computed in a previous subexpression, or to share data it computes with other subexpressions;
a node has no generic way to refer to a node that was used as the context node in a previous subexpression;
a node has no generic way to refer to other nodes that are currently selected;
if every one of the selected nodes must be compared to the same node, then that node must be uniquely definable in a way that is common to all selected nodes.
(Well, in fact I'm not 100% sure if that list is absolutely correct in every case. If anyone has better knowledge of the quirks of XPath, please comment or correct this answer by editing it.)
Despite the lack of generic solutions some of these restrictions can be overcome if there is proper knowledge of the document structure, and/or the axis used previously can be "reverted" with another axis that serves as a backlink i.e. matches only nodes that were used as context node in the previous expression. A common example of this is when a parent axis is used after first using a child axis (the opposite case, from child to parent, is not uniquely revertible without additional information). In such cases, the information from previous steps is more precisely recreated at a later step (instead of accessing previously known information).
Unfortunately in this case I couldn't come up with any other solution to refer to previously known nodes except using XPath variables (that needs to be defined beforehand).
XPath specifies a syntax for referring to a variable, but it does not specify a syntax for defining one; how variables are defined depends on the environment in which XPath is used. In fact, since the recommendation states that "The variable bindings used to evaluate a subexpression are always the same as those used to evaluate the containing expression", you could also claim that XPath explicitly forbids defining variables inside an XPath expression.
Problem reformulated
In your question the problem would be, when given a <dt>, to identify the following <dd> elements or the initially given node after the context node has been switched. Identifying the originally given <dt> is crucial since for each node in the node-set to be filtered, the predicate expression is evaluated with that node as the context node; so one cannot refer to the original <dt> in a predicate, if there is no way to identify it after the context has changed. The same applies to <dd> elements that are following siblings of the given <dt>.
If you are using variables, one could debate whether there is a major difference between 1) using XPath variable syntax and a Nokogiri-specific way to declare that variable, or 2) using Nokogiri's extended XPath syntax that allows you to use Ruby variables in an XPath expression. In both cases the variable is defined in an environment-specific way, and the meaning of the XPath is clear only if the definition of the variable is also available. A similar case can be seen in XSLT, where in some cases you can choose between 1) defining a variable with <xsl:variable> prior to using your XPath expression, or 2) using current() (inside your XPath expression), which is an XSLT extension.
Solution using nodeset variables and Kaysian method
You can select all the <dd> elements following the current <dt> element with following-sibling::dd (set A). You can also select all the <dd> elements following the next <dt> element with following-sibling::dt[1]/following-sibling::dd (set B). The set difference A\B then leaves the <dd> elements you actually wanted (elements that are in set A but not in set B). If variable $setA contains node-set A and variable $setB contains node-set B, the set difference can be obtained with (a modification of) the Kaysian technique:
dds = $setA[count(.|$setB) != count($setB)]
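The set-difference idea can be sketched outside XPath. The following Python uses only the standard library and the <dl> from the question; following_siblings and dds_for are hypothetical helper names, and node identity (id) stands in for XPath's node-set semantics. A node survives the filter when adding it to set B makes B larger, which is exactly what count(.|$setB) != count($setB) tests.

```python
import xml.etree.ElementTree as ET

dl = ET.fromstring(
    "<dl>"
    "<dt>Label1</dt><dd>Value1</dd>"
    "<dt>Label2</dt><dd>Value2</dd>"
    "<dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>"
    "<dt>Label4</dt><dd>Value4</dd>"
    "</dl>"
)
children = list(dl)


def following_siblings(node, tag):
    """Siblings after `node` with the given tag (like following-sibling::tag)."""
    start = children.index(node) + 1
    return [c for c in children[start:] if c.tag == tag]


def dds_for(dt):
    set_a = following_siblings(dt, "dd")  # following-sibling::dd
    next_dts = following_siblings(dt, "dt")
    # following-sibling::dt[1]/following-sibling::dd (empty for the last dt)
    set_b = following_siblings(next_dts[0], "dd") if next_dts else []
    ids_b = {id(n) for n in set_b}
    # $setA[count(.|$setB) != count($setB)]
    return [n for n in set_a if len(ids_b | {id(n)}) != len(ids_b)]


result = {dt.text: [dd.text for dd in dds_for(dt)] for dt in dl.findall("dt")}
print(result)
# {'Label1': ['Value1'], 'Label2': ['Value2'],
#  'Label3': ['Value3a', 'Value3b'], 'Label4': ['Value4']}
```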
A simple workaround without any variables
Currently your method is to select all the <dt> elements and then try to couple the value of each such element with the values of the corresponding <dd> elements in a single operation. Would it be possible to convert that coupling logic to work the other way round? You would first select all <dd> elements and then, for each <dd>, find the corresponding <dt>. This means that you end up accessing the same <dt> elements several times, and each operation adds only one new <dd> value. This could affect performance, and the Ruby code could be more complicated.
The good side is the simplicity of the required XPath. When given a <dd> element, finding the corresponding <dt> is amazingly simple: preceding-sibling::dt[1]
As applied to your current Ruby code:
dl.xpath('dd').each do |dd|
dt = dd.xpath("preceding-sibling::dt[1]")
## Insert new Ruby magic here ##
end
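The same "start from each <dd>" idea, written as a standard-library Python sketch rather than Nokogiri. Here current_dt plays the role of preceding-sibling::dt[1], since it is always the nearest <dt> seen before the current <dd>. Keying the result by label text is an assumption that labels are unique, which holds for the sample data.

```python
import xml.etree.ElementTree as ET

dl = ET.fromstring(
    "<dl>"
    "<dt>Label1</dt><dd>Value1</dd>"
    "<dt>Label2</dt><dd>Value2</dd>"
    "<dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>"
    "<dt>Label4</dt><dd>Value4</dd>"
    "</dl>"
)

groups = {}
current_dt = None
for child in dl:  # children in document order
    if child.tag == "dt":
        current_dt = child
        groups[current_dt.text] = []
    elif child.tag == "dd" and current_dt is not None:
        # current_dt is the nearest preceding dt for this dd
        groups[current_dt.text].append(child.text)

for label, values in groups.items():
    print(f"{label}: {', '.join(values)}")
# Label1: Value1
# Label2: Value2
# Label3: Value3a, Value3b
# Label4: Value4
```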
One possible solution:
dl.xpath('dt').each_with_index do |dt, i|
dds = dt.xpath("following-sibling::dd[not(../dt[#{i + 2}]) or " +
"following-sibling::dt[1]=../dt[#{i + 2}]]")
puts "#{dt.text}: #{dds.map(&:text).join(', ')}"
end
This relies on a value comparison of dt elements and will fail when there are duplicates. The following (much more complicated) expression does not depend on unique dt values:
following-sibling::dd[not(../dt[$n]) or
(following-sibling::dt[1] and count(following-sibling::dt[1]|../dt[$n])=1)]
Note: Your use of self fails because you're not using it properly as an axis (self::). Also, self:: always selects just the context node, so it would refer to each dd inspected by the expression, not back to the original dt.

XPATH Multiple Element Filters

I have the following sample XML structure:
<SavingAccounts>
<SavingAccount>
<ServiceOnline>yes</ServiceOnline>
<ServiceViaPhone>no</ServiceViaPhone>
</SavingAccount>
<SavingAccount>
<ServiceOnline>no</ServiceOnline>
<ServiceViaPhone>yes</ServiceViaPhone>
</SavingAccount>
</SavingAccounts>
What I need to do is filter the 'SavingAccount' nodes using XPATH where the value of 'ServiceOnline' is 'yes' or the value of 'ServiceViaPhone' is 'yes'.
The XPATH should return two rows. I can filter 'SavingAccount' nodes where both element values are 'yes', as in the following XPATH sample, but what I want is an "or" comparison of the element values.
/SavingAccounts/SavingAccount/ServiceOnline[text()='yes']/../ServiceViaPhone[text()='yes']/..
This is a very fundamental XPath feature: composing a number of conditions with the logical operators and, or, and the function not().
and has a higher priority than or, and both operators have a lower priority than the relational and equality operators (=, !=, >, >=, < and <=).
So it is safe to write: A = B and C = D
Some of the most frequent mistakes:
People write AND and/or OR. Remember, XPath is case-sensitive.
People use the | (union) operator instead of or.
Lastly, here is my solution:
/SavingAccounts/SavingAccount
[ServiceOnline='yes' or ServiceViaPhone='yes']
/SavingAccounts/SavingAccount[(ServiceOnline='yes') or (ServiceViaPhone='yes')]
Will
/SavingAccounts/SavingAccount[ServiceOnline/text()='yes' or ServiceViaPhone/text()='yes']
do the trick?
I have no XPath evaluator handy at the moment.
EDIT:
If I remember correctly, you don't need the text(), so
[ServiceOnline='yes' or ServiceViaPhone='yes']
should be sufficient, and more readable.
EDIT:
Yes, of course, 'or' for predicate expressions, my bad.
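For a quick check without an XPath evaluator, the same "or" condition can be applied with Python's standard library. This is a sketch rather than an XPath engine: ElementTree's path subset is limited, so the condition is written in Python, and matches is a hypothetical helper name. It also shows why element-name case matters: looking up ServiceOnLine (capital L) finds nothing in this document.

```python
import xml.etree.ElementTree as ET

accounts = ET.fromstring("""
<SavingAccounts>
  <SavingAccount>
    <ServiceOnline>yes</ServiceOnline>
    <ServiceViaPhone>no</ServiceViaPhone>
  </SavingAccount>
  <SavingAccount>
    <ServiceOnline>no</ServiceOnline>
    <ServiceViaPhone>yes</ServiceViaPhone>
  </SavingAccount>
</SavingAccounts>
""".strip())


def matches(account):
    # Same condition as [ServiceOnline='yes' or ServiceViaPhone='yes']
    return (account.findtext("ServiceOnline") == "yes"
            or account.findtext("ServiceViaPhone") == "yes")


selected = [a for a in accounts.findall("SavingAccount") if matches(a)]
print(len(selected))  # 2

# Names are case-sensitive: the misspelled element does not exist.
print(accounts[0].findtext("ServiceOnLine"))  # None
```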
