XPath text() expression: what to specify in text() if it contains a newline

The text spans two lines:
text1
text2
How can I specify this text in XPath? I tried:
.//*[@id='someid']//h6[text() ='text1text2']
.//*[@id='someid']//h6[text() ='text1\ntext2']
.//*[@id='someid']//h6[text() ='text1 text2']
None of them worked.

Use .//*[@id='someid']//h6[. = 'text1&#10;text2']. This assumes you are writing the path inside XSLT or XForms, where you can use &#10; to escape a newline character. If you are not using XSLT, tell us which host language (e.g. PHP, C#, Java) you use XPath from.
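
XPath 1.0 string literals have no escape sequences of their own, so the newline has to be produced by the host language before the expression reaches the XPath engine. A minimal sketch in Python, assuming the standard library's xml.etree.ElementTree (whose limited XPath subset supports [.='text'] from Python 3.7 on) and a made-up wrapper document around the h6 element:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the h6 text contains a real line feed.
root = ET.fromstring("<body><div id='someid'><h6>text1\ntext2</h6></div></body>")

# Normal string: Python turns \n into a line feed before XPath sees it.
with_newline = ".//*[@id='someid']//h6[.='text1\ntext2']"

# Raw string: XPath receives a backslash followed by "n" and compares it literally.
with_backslash = r".//*[@id='someid']//h6[.='text1\ntext2']"

matches_newline = root.findall(with_newline)
matches_backslash = root.findall(with_backslash)

print(len(matches_newline))    # 1
print(len(matches_backslash))  # 0
```

The second query fails for the same reason the '\n' attempt in the question did: the engine never interprets the backslash.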

Not very elegant, but it works:
.//*[@id='someid']//h6[contains(text(), 'text1') and contains(text(), 'text2')]

You can use normalize-space() to collapse the line feed into a single space and compare the text without this issue.
//*[@id='someid']//h6[normalize-space(text()) ='text1 text2']
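
Some XPath implementations do not provide normalize-space(); Python's xml.etree.ElementTree, for example, supports no function calls in its XPath subset. In that case the same comparison can be done in the host language. The sketch below is an assumption-laden stand-in, not the XPath function itself: normalize_space is a hypothetical helper that trims and collapses the four XML whitespace characters, and it is applied to the element's whole string value (the equivalent of normalize-space(.), not of text()).

```python
import re
import xml.etree.ElementTree as ET


def normalize_space(s):
    """Mimic XPath normalize-space(): collapse whitespace runs, trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


# Hypothetical document with line feeds and indentation inside the h6.
root = ET.fromstring(
    "<body><div id='someid'><h6>\n  text1\n  text2\n</h6></div></body>"
)

# Select candidates with XPath, then filter on the normalized text in Python.
hits = [
    h6
    for h6 in root.findall(".//*[@id='someid']//h6")
    if normalize_space("".join(h6.itertext())) == "text1 text2"
]

print(len(hits))  # 1
```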

This is the working code:
.//*[@id='someid']//h6[. = 'text1text2']
Thank you.

Related

Escape single quote in Xtend template expression

I have a very simple question, but could not figure it out by Google search, please help.
I want to produce this string '\u0000' (note the single quote marks surrounding it!) using the following simple Xtend method containing a template expression:
def String makeDefaultChar()
{
''''\u0000''''
}
However, this is not accepted as proper syntax (probably because of the four consecutive single quotes ''''). Is there an escape character for this use case, or what is the right syntax?
Thank you in advance!
P.S.
Of course I could use plain Java string like this "'\\u0000'" to achieve the same, but I want to use an Xtend template expression.
My Xtend version is: 2.9.1.v201512180746
There is no "escaping" in template expressions, so you have to use the workaround you mentioned:
'''«"'\\u0000'"»'''
or
'''«"'"»\u0000«"'"»'''
Related discussion: https://groups.google.com/forum/#!topic/xtend-lang/bVZ0nKmQGAI
Single quotes are allowed within Xtend templates as long as they do not occur at the beginning or the end of the template. So a simple workaround is to add an empty expression before/after the single quote:
'''«»'\u0000'«»'''

How to write Xpath expressions to distinguish between results?

I am new to XPath expressions and need help with an issue.
Consider the following Document :
<tbody><tr>
<td>By <strong>Bec</strong></td>
<td><strong>Great Support</strong></td>
</tr></tbody>
In this document I have to find the text inside the strong tags separately.
Following is my xpath expression:
//tbody//td//strong/text();
It evaluates output as expected:
Bec
Great Support
How can I write XPath expressions to distinguish between the results, i.e. Bec and Great Support?
It's rather unclear what you're trying to do, but the following should succeed in selecting them separately:
//tbody/tr/td[1]/strong
and
//tbody/tr/td[2]/strong
Note that the text() you had at the end is most likely not needed in this case.
Not sure I understand 100%, but if you're trying to get the text of the first and the second strong tags, you can use position() (1-based index).
First text:
//tbody/tr/td[position()=1]/strong/text()
Second text:
//tbody/tr/td[position()=2]/strong/text()
This solution only applies to the current sample, though, where your strong tags are inside either the first or second td tag.
Not sure this is what you're looking for. Anyway, assuming you're asking how to retrieve a node based on its text, you can look it up by its text content with something like:
//tbody//td//strong/text()[.="Bec"]
PS
In [.=""] the dot is an alias for self::node(), not for text() (thanks JLRishe for pointing out the mistake).
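
The positional and text-based selections above can be tried out quickly. This is a minimal sketch assuming Python's xml.etree.ElementTree, whose XPath subset covers [1]-style positions and [.='text'] (Python 3.7+); the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question; tbody is the root element here.
tbody = ET.fromstring(
    "<tbody><tr>"
    "<td>By <strong>Bec</strong></td>"
    "<td><strong>Great Support</strong></td>"
    "</tr></tbody>"
)

# Select each strong element by the position of its parent td.
first = tbody.find("./tr/td[1]/strong").text
second = tbody.find("./tr/td[2]/strong").text

# Select a strong element by its text content instead of its position.
by_text = tbody.findall(".//strong[.='Bec']")

print(first)         # Bec
print(second)        # Great Support
print(len(by_text))  # 1
```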

xpath to check '#' present

I want to write an XPath expression that checks whether a node contains '#'.
<node1>
<node11>Some text</node11>
<node11>#2o11 PickMe</node12>
</node1>
I want to write an XPath like "//node11[contains(,'#\d+')]". What's the correct way to check for '#'?
The correct XPath expression is:
//node11[contains(., '#')]
In your XML, the closing tag of the second subnode should be </node11> instead of </node12>.
If you are using XPath 2.0, you should be able to use something like:
"//node11[matches(.,'#\d+')]"
However, XPath 1.0 has no regex support, so you won't be able to match \d+ there. This will still work:
"//node11[contains(.,'#')]"
Or even:
"//node11[starts-with(.,'#')]"
Use:
/*/node11[contains(., '#')]
Note: Avoid the // pseudo-operator where you can, because it makes the engine search all descendants of the context node, which often leads to very slow evaluation of the XPath expression.
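
When the XPath engine at hand offers neither contains() nor matches(), the same three checks can be done in the host language after selecting the candidate nodes. The following is only a sketch under that assumption, using Python's xml.etree.ElementTree (which has no XPath functions) and the corrected sample XML; string_value is a hypothetical helper standing in for the dot in the expressions above.

```python
import re
import xml.etree.ElementTree as ET

# Sample XML from the question, with the closing tag corrected.
root = ET.fromstring(
    "<node1>"
    "<node11>Some text</node11>"
    "<node11>#2o11 PickMe</node11>"
    "</node1>"
)


def string_value(elem):
    """Concatenated text of an element, like '.' in an XPath predicate."""
    return "".join(elem.itertext())


nodes = root.findall("./node11")  # comparable to /*/node11

# Stand-in for contains(., '#')
with_hash = [n for n in nodes if "#" in string_value(n)]
# Stand-in for starts-with(., '#')
starts_hash = [n for n in nodes if string_value(n).startswith("#")]
# Stand-in for the XPath 2.0 matches(., '#\d+'), which is also unanchored
hash_digits = [n for n in nodes if re.search(r"#\d+", string_value(n))]

print([n.text for n in with_hash])  # ['#2o11 PickMe']
```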

How to change this Ruby regex to also include underlines?

At the moment I am using this line to take a string and extract only the letters from it:
string.scan(/[a-zA-Z]/).to_s
How do I modify this so that the underline character, "_", is also included? Thanks for reading.
Add it inside the brackets (the character class).
string.scan(/[a-zA-Z_]/).to_s
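Alternative version, using the case-insensitive flag:
string.scan(/[a-z_]/i).to_s
On Ruby 1.9 and later, Array#to_s returns the inspect form (e.g. ["a", "b"]), so use .join instead of .to_s to get a plain string.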

XPath expression?

I want to extract "Date: 2009-09-25, 1:54PM EDT" from this webpage
http://auburn.craigslist.org/sha/1392067187.html
But I don't understand how to write XPath expressions for that.
Can anyone help me with that?
I am also getting other fields from this page.
Why don't you just run a regexp like the one below?
'Date:\s+([0-9]{4}-[0-9]{2}-[0-9]{2}.+?\<)'
It seems to be the easiest way. And if you don't want to work on the raw text, you can use XPath 2.0, which supports regexps (fn:matches).
Are you running the HTML through Tidy or some other process to turn it into XHTML? Otherwise, how are you able to execute XPath against that HTML?
If the document is well-formed, you could probably use the following XPath:
/html/body/hr[1]/following-sibling::text()[1]
It finds the first hr element in the document, then selects the first text() node following it (which contains the string "Date: 2009-09-25, 1:54PM EDT").
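
For readers whose XPath engine lacks the following-sibling axis, here is a rough equivalent in Python's xml.etree.ElementTree, where the text directly after an element is stored in that element's tail attribute. The markup below is a made-up, already well-formed stand-in for the page (its real structure is not shown here), and the sketch only matches the XPath above when the text immediately follows the first hr.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the posting page.
page = ET.fromstring(
    "<html><body>"
    "<h2>Some posting title</h2>"
    "<hr/>"
    "Date: 2009-09-25, 1:54PM EDT<br/>"
    "Reply to: see below"
    "<hr/>"
    "</body></html>"
)

# find() returns the first match, comparable to /html/body/hr[1].
first_hr = page.find("./body/hr")

# The text node right after the hr is its tail (None if nothing follows).
date_text = (first_hr.tail or "").strip()

print(date_text)  # Date: 2009-09-25, 1:54PM EDT
```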
