Finding an exact match of an inner text using XPath - xpath

What XPath syntax can I use to find an anchor tag where the inner text is exactly "abc"? The closest I can get to this is:
SelectSingleNode(".//a[starts-with(., \"abc\")]");
I couldn't find any "equals" function to use.

Try the following
SelectSingleNode("//a[.='abc']");
// usually means you intend to search the whole tree, so why would you add . before that?
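The difference between an exact match and a prefix match can be sketched in Python with the standard library's xml.etree.ElementTree, which supports a small XPath subset that includes the [.='text'] predicate (Python 3.7 or later). The sample markup is made up for illustration, and the prefix match is written as a plain Python filter because ElementTree does not implement starts-with().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: only the second link's text is exactly "abc".
root = ET.fromstring("<div><a>abcdef</a><a>abc</a><p><a>xabc</a></p></div>")

# Exact match: [.='abc'] compares the element's complete text content.
exact = [a.text for a in root.findall(".//a[.='abc']")]

# Prefix match, equivalent in spirit to starts-with(., 'abc').
prefix = [a.text for a in root.iter("a") if (a.text or "").startswith("abc")]

print(exact)
print(prefix)
```

The prefix version also picks up "abcdef", which is why the equality predicate is the better fit for the question.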

Related

XDMP-REGEX: (err:FORX0002) - String transformation with Regular expressions

I am working on an XQuery requirement to identify the XML tag name() from an XML document using a regex, and later to transform the data. The query searches the entire document and, if a match is found, does a string replace using XQuery/XPath.
Here is some sample code showing what I am trying to do.
let $full-doc := fn:doc($uri)
if(fn:matches($full-doc,"<Hyperlink\b[^\>]*?>([A-Z][a-z]{2} [0-3]?[0-9]
[12][890][0-9]{2})</Hyperlink>"))
then $full-doc
else "regex is not working"
I am getting the following Error.
regex-match :
[1.0-ml] XDMP-REGEX: (err:FORX0002) fn:matches(fn:doc("44215.xml"), "
<Hyperlink\b[^\>]*?>([A-Z][a-z]{2} [0-3]?[0-9] [12][890][0-9]{2}...") -
- Invalid regular expression
Could someone please explain why my regex is not working?
Looking at your requirement:
I am working on xquery requirement to identify the xml tag name() from the XML document using the regex.
You are going about this entirely the wrong way. XQuery doesn't see the lexical XML, it sees a tree of nodes. To find the name of an element, use an XPath expression to find the element, then use the name() function to get its name.
If you want to find an element whose name matches a regex, use //*[matches(name(), $regex)]
The word boundary code \b is not supported in XQuery (see https://www.w3.org/TR/xpath-functions-31/#regex-syntax).
But I guess you are looking for Hyperlink elements, not for a <Hyperlink> substring, so you should use a path expression:
let $doc := fn:doc($uri)
where $doc//Hyperlink[matches(., '([A-Z][a-z]{2} [0-3]?[0-9] [12][890][0-9]{2})')]
return $doc
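The same idea (test the text of Hyperlink elements in the parsed tree rather than searching serialized markup) can be sketched outside XQuery. Below is a rough Python analogue using only the standard library; the sample document is invented, and re.search is used because XPath's matches() also succeeds on a partial match.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the original question.
doc = ET.fromstring(
    "<doc><Hyperlink>Jan 12 1999</Hyperlink>"
    "<Hyperlink>see page 4</Hyperlink><Note>Feb 3 2001</Note></doc>"
)

# The date pattern from the question, applied to element text only.
date_re = re.compile(r"[A-Z][a-z]{2} [0-3]?[0-9] [12][890][0-9]{2}")
dated_links = [el.text for el in doc.iter("Hyperlink") if date_re.search(el.text or "")]

# Analogue of //*[matches(name(), $regex)]: select elements by a regex on their name.
name_re = re.compile(r"^Hyper")
matching_names = [el.tag for el in doc.iter() if name_re.search(el.tag)]

print(dated_links)
print(matching_names)
```

No regex ever has to describe angle brackets or attributes, because the parser has already turned the markup into nodes.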

xpath search from importxml function in google sheets

How do I get the "Div/yield" value from here? I've tried //td[node()='Div/yield'] and //td[text()='Div/yield'],
and //td[@data-snapfield='latest_dividend-dividend_yield']/following-sibling::td
@sideshowbarker is correct that there's a newline at the end of the cell text, so looking for an element with the exact text returns 0 results. Another way to do this (besides @sideshowbarker's answer) is to look for an element that contains this text. So the first step is:
//td[contains(text(),'Div/yield')]
But you don't need this. Your last attempt is on the right track. You've identified the element that you're after, but I think you're looking for its text, so you need to add text() at the end:
//td[@data-snapfield='latest_dividend-dividend_yield']/following-sibling::td/text()
But if you want to use the field name, so you could use the xpath for the other fields as well, then just combine these:
//td[contains(text(),'Field name')]/following-sibling::td/text()
Now just replace Field name with the field you're after,
e.g. 'Div/yield': //td[contains(text(),'Div/yield')]/following-sibling::td/text()
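To see why the trailing newline defeats an exact comparison while a contains-style test still works, here is a small Python sketch using the standard library. The table markup and values are invented stand-ins for the real page, and value_for is a hypothetical helper. ElementTree has no following-sibling axis, so the sketch takes the next cell in the same row, which is a simplification of following-sibling::td.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: each label cell ends with a newline, as on the real page.
table = ET.fromstring(
    "<table>"
    "<tr><td>Range\n</td><td>10.00 - 12.00\n</td></tr>"
    "<tr><td>Div/yield\n</td><td>0.55/2.10\n</td></tr>"
    "</table>"
)

# Exact comparison fails because the cell text is "Div/yield\n", not "Div/yield".
exact = table.findall(".//td[.='Div/yield']")

def value_for(field):
    """Return the text of the cell right after the cell whose text contains `field`."""
    for row in table.iter("tr"):
        cells = list(row)
        for i, cell in enumerate(cells[:-1]):
            if field in (cell.text or ""):
                return cells[i + 1].text.strip()
    return None

print(len(exact))
print(value_for("Div/yield"))
```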

I am trying to use XPath function contains() that has a string in 2 parts but it is throwing an invalid xpath error

I am trying to use XPath function contains() that has a string in 2 parts but it is throwing an "invalid xpath expression" error upon evaluation.
Here is what I am trying to achieve:
Normal working xpath:
//*[contains(text(),'some_text')]
Now I want to break it up in 2 parts as some random text is populating in between:
//*[contains(text(),'some'+ +'text')]
What I have done is use '+' '+' to concatenate the strings in the expression, as we do in Java. Please suggest how I can get this to work.
You can combine 2 contains() in one predicate expression to check if a text node contains 2 specific substrings :
//*[text()[contains(.,'some') and contains(.,'text')]]
If you need to be more specific by making sure that 'text' comes somewhere after 'some' in the text node, then you can use combination of substring-after() and contains() as shown below :
//*[text()[contains(substring-after(.,'some'),'text')]]
If each target element always contains exactly one text node, or if only the first text node needs to be considered when an element has several, then the above XPath can be simplified a bit as follows:
//*[contains(substring-after(text(),'some'),'text')]
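The difference between the two predicates is easiest to see on plain strings. The Python sketch below re-implements the XPath 1.0 semantics of contains() and substring-after() as small helper functions. These helpers are illustrative stand-ins, not part of any XPath library.

```python
def contains(haystack, needle):
    """XPath contains(): true if needle occurs anywhere in haystack."""
    return needle in haystack

def substring_after(text, marker):
    """XPath substring-after(): text following the first marker, or '' if absent."""
    if marker == "":
        return text
    _, found, tail = text.partition(marker)
    return tail if found else ""

def both_parts(text):
    # contains(., 'some') and contains(., 'text'): order does not matter.
    return contains(text, "some") and contains(text, "text")

def parts_in_order(text):
    # contains(substring-after(., 'some'), 'text'): 'text' must follow 'some'.
    return contains(substring_after(text, "some"), "text")

print(both_parts("some_random_text"), parts_in_order("some_random_text"))
print(both_parts("text before some"), parts_in_order("text before some"))
```

Only the second form rejects a string in which 'text' appears before 'some'.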

How to write Xpath expressions to distinguish between results?

I am new to XPath expressions and need help with an issue.
Consider the following document:
<tbody><tr>
<td>By <strong>Bec</strong></td>
<td><strong>Great Support</strong></td>
</tr></tbody>
Here I need to get the text inside the tags separately.
Following is my xpath expression:
//tbody//td//strong/text();
It evaluates output as expected:
Bec
Great Support
How can I write XPath expressions to distinguish between the results, i.e. Bec and Great Support?
It's rather unclear what you're trying to do, but the following should succeed in selecting them separately:
//tbody/tr/td[1]/strong
and
//tbody/tr/td[2]/strong
Note that the text() you had at the end is most likely not needed in this case.
Not sure I understand 100%, but if you're trying to get the text of the first and the second strong tags, you can use position (1 based index)
//tbody/tr/td[position()=1]/strong/text() //first text
//tbody/tr/td[position()=2]/strong/text() //second text
This solution only applies to the current sample though, where your strong tags are inside either the first or second td tag.
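Positional selection on the sample document can be checked quickly in Python. ElementTree in the standard library supports 1-based position predicates such as td[1]. The wrapping table element is an assumption added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET

# The sample rows from the question, wrapped in a <table> root (an assumption).
root = ET.fromstring(
    "<table><tbody><tr>"
    "<td>By <strong>Bec</strong></td>"
    "<td><strong>Great Support</strong></td>"
    "</tr></tbody></table>"
)

# Position predicates are 1-based, as in XPath.
first = root.find("./tbody/tr/td[1]/strong").text
second = root.find("./tbody/tr/td[2]/strong").text

# The td elements are children of tr, so a path that skips tr finds nothing.
skipped_tr = root.findall("./tbody/td[1]/strong")

print(first)
print(second)
print(len(skipped_tr))
```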
Not sure this is what you're looking for, but assuming you want to retrieve a node based on its text, you can match on the text content with something like:
//tbody//td//strong/text()[.="Bec"]
PS
in [.=""] the dot is an alias for self::node(), not for text() (thanks JLRishe for pointing out the mistake).

Trouble using Xpath "starts with" to parse xhtml

I'm trying to parse a webpage to get posts from a forum.
Each message starts with the following format
<div id="post_message_somenumber">
and I only want to get the first one.
I tried xpath='//div[starts-with(@id, '"post_message_')]' in YQL without success.
I'm still learning this; does anyone have suggestions?
I think I have a solution that does not require dealing with namespaces.
Here is one that selects all matching div's:
//div[@id[starts-with(.,"post_message")]]
But you said you wanted just the "first one" (I assume you mean the first "hit" in the whole page?). Here is a slight modification that selects just the first matching result:
(//div[@id[starts-with(.,"post_message")]])[1]
These use the dot to represent the id's value within the starts-with() function. You may have to escape special characters in your language.
It works great for me in PowerShell:
# Load a sample xml document
$xml = [xml]'<root><div id="post_message_somenumber"/><div id="not_post_message"/><div id="post_message_somenumber2"/></root>'
# Run the xpath selection of all matching div's
$xml.selectnodes('//div[@id[starts-with(.,"post_message")]]')
Result:
id
--
post_message_somenumber
post_message_somenumber2
Or, for just the first match:
# Run the xpath selection of the first matching div
$xml.selectnodes('(//div[@id[starts-with(.,"post_message")]])[1]')
Result:
id
--
post_message_somenumber
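For comparison, the same selection can be sketched in Python with the standard library. ElementTree's XPath subset does not include starts-with(), so the prefix test is done with str.startswith on the id attribute. The sample XML is the one used in the PowerShell example.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the PowerShell example.
root = ET.fromstring(
    '<root><div id="post_message_somenumber"/>'
    '<div id="not_post_message"/>'
    '<div id="post_message_somenumber2"/></root>'
)

# All div elements whose id starts with "post_message", in document order.
matching_ids = [
    div.get("id")
    for div in root.iter("div")
    if div.get("id", "").startswith("post_message")
]

# Equivalent of wrapping the expression in (...)[1]: keep only the first hit.
first_id = matching_ids[0] if matching_ids else None

print(matching_ids)
print(first_id)
```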
I tried xpath='//div[starts-with(@id, '"post_message_')]' in yql without success
I'm still learning this, anyone have suggestions
If the problem isn't due to the many nested apostrophes and the unclosed double-quote, then the most likely cause (we can only guess without being shown the XML document) is that a default namespace is used.
How to specify names of elements that are in a default namespace is the most frequently asked question about XPath. If you search for "XPath default namespace" on SO or on the internet, you'll find many sources with the correct solution.
Generally, a special method must be called that binds a prefix (say "x:") to the default namespace. Then, in the XPath expression, every element name "someName" must be replaced by "x:someName".
Here is a good answer how to do this in C#.
Read the documentation of your language or XPath engine to see how something similar should be done in your specific environment.
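As one concrete illustration of binding a prefix to a default namespace, here is a minimal Python sketch using xml.etree.ElementTree, which accepts a prefix-to-URI mapping as the second argument of findall. The XHTML snippet and the prefix "x" are assumptions chosen for the example. The forum page in the question may or may not use this namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML page that declares a default namespace.
page = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div id="post_message_1"/><div id="other"/>'
    "</body></html>"
)

# Without a prefix the name "div" refers to no namespace, so nothing matches.
unprefixed = page.findall(".//div")

# Bind the prefix "x" to the default namespace and use it in the path.
ns = {"x": "http://www.w3.org/1999/xhtml"}
prefixed = page.findall(".//x:div", ns)

print(len(unprefixed))
print([div.get("id") for div in prefixed])
```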
@FindBy(xpath = "//div[starts-with(@id,'expiredUserDetails') and contains(text(), 'Details')]")
private WebElementFacade ListOfExpiredUsersDetails;
This one gives a list of all div elements on the page whose id starts with expiredUserDetails and whose text contains Details.
