Basically, the question is how to translate the 'and' and 'not' operators correctly in the given XPath:
"//div[contains(@class, 'cpthead') or contains(@class, 'cptbulkhead') and (not(contains(@class, 'overloaded') or contains(@class, 'loaded')))]"
The answer is:
.cpthead:not(.overloaded):not(.loaded), .cptbulkhead:not(.overloaded):not(.loaded)
For the or you can use a comma, like this:
div.cpthead.cptbulkhead.loaded:not(.overloaded),
div.cpthead.cpthead.loaded:not(.overloaded)
PS: This is a reply to har07's answer, but I can't post comments...
You can chain multiple class selectors in CSS to translate XPath 'and', and use the :not() pseudo-class to translate XPath not(), something like this:
div.cpthead.cptbulkhead.loaded:not(.overloaded)
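The translations above can be sanity-checked with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the XPath side treats contains() as a plain substring test on the class attribute, and the CSS side treats each class selector as a whole-token match. Because XPath's 'and' binds tighter than 'or' (the same as in Python), the two forms can disagree on some inputs.

```python
def xpath_as_written(class_attr: str) -> bool:
    """Mirror the XPath predicate literally; contains() is a substring test."""
    a = "cpthead" in class_attr
    b = "cptbulkhead" in class_attr
    c = "overloaded" in class_attr
    d = "loaded" in class_attr
    # XPath 'and' has higher precedence than 'or', just like Python's operators,
    # so this reads as: a or (b and not (c or d)).
    return a or b and not (c or d)


def css_selector_list(class_attr: str) -> bool:
    """Mirror .cpthead:not(.overloaded):not(.loaded), .cptbulkhead:not(...)."""
    tokens = set(class_attr.split())  # CSS class selectors match whole tokens
    wanted = {"cpthead", "cptbulkhead"}
    excluded = {"overloaded", "loaded"}
    return bool(tokens & wanted) and not (tokens & excluded)


# A class value on which the two forms disagree (hypothetical test input).
sample = "cpthead loaded"
disagree = xpath_as_written(sample) != css_selector_list(sample)
```

If the intent was "(cpthead or cptbulkhead) and not (overloaded or loaded)", the XPath would need explicit parentheses around the 'or' part to match the comma-separated CSS form.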
I have this query: //*[@id="test"]/div/[not(contains(.,'/explore'))]
I want to add a second 'not contains' condition to it:
//*[@id="test"]/div/[not(contains(.,'/locations'))]
And maybe even a 3rd one. Does anyone know how to do this?
None of what you posted is a valid XPath expression. If you meant to filter the div elements so that only those that don't contain a certain string, say "/explore", are returned, you can do it this way instead:
//*[@id="test"]/div[not(contains(.,'/explore'))]
And here is another XPath example, which checks that the div contains neither of two strings, "/explore" and "/locations":
//*[@id="test"]/div[not(contains(.,'/explore')) and not(contains(.,'/locations'))]
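For a third or fourth excluded string, the predicate just grows by one more "and not(contains(...))" term. A minimal Python sketch of building that expression follows; the helper names are hypothetical, and it assumes the excluded strings contain no single quotes, since XPath 1.0 string literals have no escape mechanism.

```python
def not_contains_predicate(substrings):
    """Join one not(contains(.,'...')) test per substring with 'and'."""
    # Assumption: no substring contains a single quote.
    tests = [f"not(contains(.,'{s}'))" for s in substrings]
    return "[" + " and ".join(tests) + "]"


def build_xpath(substrings):
    """Attach the combined predicate to the div step from the question."""
    return '//*[@id="test"]/div' + not_contains_predicate(substrings)


# Three exclusions; '/about' is a made-up third value for illustration.
expression = build_xpath(["/explore", "/locations", "/about"])
```

The resulting string can be handed to whichever XPath engine is in use.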
The text is as:
text1text2
How can I specify this text in xpath. I tried:
.//*[@id='someid']//h6[text() ='text1text2]
.//*[@id='someid']//h6[text() ='text1\ntext2]
.//*[@id='someid']//h6[text() ='text1 text2]
None of them worked
Use .//*[@id='someid']//h6[. = 'text1&#10;text2']. This assumes you are writing the path inside XSLT or XForms, where you can use &#10; to escape a newline character. If you are not using XSLT, you might want to tell us in which host language (e.g. PHP, C#, Java) you use XPath.
Not very elegant, but it works:
.//*[@id='someid']//h6[contains(text(), 'text1') and contains(text(), 'text2')]
You can use normalize-space() to collapse the line feed into a single space and compare the text without this issue.
//*[@id='someid']//h6[normalize-space(text()) ='text1 text2']
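To see what normalize-space() does to the value, here is a rough Python equivalent. It is a sketch, not an XPath engine: the sample document is made up to match the question, and str.split() treats a few more characters as whitespace than XPath does (XPath only considers space, tab, CR and LF).

```python
import xml.etree.ElementTree as ET


def normalize_space(text: str) -> str:
    """Approximate XPath normalize-space(): trim, then collapse whitespace runs."""
    return " ".join(text.split())


# Hypothetical sample: an h6 whose text is split across two lines.
doc = ET.fromstring("<div id='someid'><h6>text1\ntext2</h6></div>")
h6 = doc.find(".//h6")
raw_text = "".join(h6.itertext())       # still contains the line feed
clean_text = normalize_space(raw_text)  # line feed collapsed to one space
```

Comparing clean_text against 'text1 text2' then succeeds, which is why the normalized XPath comparison works where the literal one did not.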
This is the working code
.//*[@id='someid']//h6[. = 'text1text2']
Thank you.
I am new to XPath expressions and need help with an issue.
Consider the following Document :
<tbody><tr>
<td>By <strong>Bec</strong></td>
<td><strong>Great Support</strong></td>
</tr></tbody>
In this document I have to find the text inside the strong tags separately.
Following is my xpath expression:
//tbody//td//strong/text();
It evaluates output as expected:
Bec
Great Support
How can I write XPath expressions to distinguish between the results, i.e. Bec and Great Support?
It's rather unclear what you're trying to do, but the following should succeed in selecting them separately:
//tbody/tr/td[1]/strong
and
//tbody/tr/td[2]/strong
Note that the text() you had at the end is most likely not needed in this case.
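As a quick check of the two positional paths, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset including positional predicates such as td[1]. The variable names are illustrative, and the paths are written relative to the tbody root element because that is how ElementTree evaluates them.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
SAMPLE = """<tbody><tr>
<td>By <strong>Bec</strong></td>
<td><strong>Great Support</strong></td>
</tr></tbody>"""

tbody = ET.fromstring(SAMPLE)

# Equivalent of //tbody/tr/td[1]/strong and //tbody/tr/td[2]/strong.
first = tbody.find("./tr/td[1]/strong")
second = tbody.find("./tr/td[2]/strong")

# .text gives the element's own text, so no trailing text() step is needed.
first_text, second_text = first.text, second.text
```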
Not sure I understand 100%, but if you're trying to get the text of the first and the second strong tags, you can use position (1 based index)
//tbody/tr/td[position()=1]/strong/text() //first text
//tbody/tr/td[position()=2]/strong/text() //second text
This solution only applies to the current sample though, where your strong tags are inside either the first or second td tag.
Not sure this is what you're looking for... Anyway, assuming you're asking how to retrieve a node based on its text, you can look up the text content by doing something like:
//tbody//td//strong/text()[.="Bec"]
PS
in [.=""] the dot is an alias for self::node(), not text() (thanks JLRishe for pointing out the mistake).
I want to write an XPath expression that checks whether a node contains '#'.
<node1>
<node11>Some text</node11>
<node11>#2o11 PickMe</node12>
</node1>
I want to write an XPath like "//node11[contains(,'#\d+')]". What's the correct way to check for #?
The correct XPath expression is:
//node11[contains(., '#')]
In your XML, the closing tag of the second subnode should be </node11> instead of </node12>.
If you are using xpath 2.0 you should be able to use something like:
"//node11[matches(.,'#\d+')]"
However, if you aren't using 2.0 you won't have regex support directly. If you are using 1.0 then you won't be able to match using \d+. But this will work:
"//node11[contains(.,'#')]"
Or even:
"//node11[starts-with(.,'#')]"
Use:
/*/node11[contains(., '#')]
Note: It is recommended to avoid using the // pseudo-operator because this most often leads to very slow evaluation of the XPath expression.
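The three tests discussed in the answers above (contains, starts-with, and the XPath 2.0 matches with a regex) can be mimicked in Python for comparison. This is only a sketch: it uses the corrected XML with the closing tag fixed, filters in Python rather than inside an XPath engine, and the helper names are made up.

```python
import re
import xml.etree.ElementTree as ET

# Corrected sample: the second element now closes with </node11>.
doc = ET.fromstring(
    "<node1><node11>Some text</node11><node11>#2o11 PickMe</node11></node1>"
)

HASH_DIGITS = re.compile(r"#\d+")  # same pattern as in matches(., '#\d+')


def text_of(node):
    """String value of a node: all descendant text concatenated."""
    return "".join(node.itertext())


nodes = doc.findall("./node11")  # direct children, like /*/node11

with_hash = [text_of(n) for n in nodes if "#" in text_of(n)]                    # contains
starting_with_hash = [text_of(n) for n in nodes if text_of(n).startswith("#")]  # starts-with
with_hash_digits = [text_of(n) for n in nodes if HASH_DIGITS.search(text_of(n))]  # matches
```

re.search is used rather than re.match because matches() succeeds when the pattern is found anywhere in the string, not only at the start.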
I want to extract "Date: 2009-09-25, 1:54PM EDT" from this webpage
http://auburn.craigslist.org/sha/1392067187.html
But I don't understand how to write Xpath expressions for that.
Can anyone help me in that.
I am getting other fields also from this page.
Why don't you just run a regexp like the one below?
'Date:\s+([0-9]{4}-[0-9]{2}-[0-9]{2}.+?\<)'
It seems to be the easiest way. And if you don't want to work on the raw text, you can use XPath 2.0, which has support for regexps (fn:matches).
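A minimal Python sketch of the regexp approach follows. The HTML fragment is a made-up stand-in for the page source (the real markup is not shown in this thread), and the pattern is a slight variant of the one above: the '<' is moved outside the capture group so that the group holds only the date text.

```python
import re

# Hypothetical fragment; the actual page markup may differ.
html = "<hr>\nDate: 2009-09-25, 1:54PM EDT<br>\nReply to: see below<br>"

# Lazy '.+?' stops at the first '<' after the date.
DATE_RE = re.compile(r"Date:\s+([0-9]{4}-[0-9]{2}-[0-9]{2}.+?)<")

match = DATE_RE.search(html)
date_value = match.group(1) if match else None
date_line = f"Date: {date_value}" if date_value else None
```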
Are you running the HTML through TIDY or some other process to turn it into XHTML? Or how are you able to execute XPATH against that HTML?
If the document was well-formed, then you could probably use the following XPATH:
/html/body/hr[1]/following-sibling::text()[1]
It finds the first HR element in the document, then selects the first text() node following it (which contains the string "Date: 2009-09-25, 1:54PM EDT").
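With Python's standard xml.etree.ElementTree, which has no following-sibling axis, the same idea can be sketched through the element's tail attribute, which holds the text immediately after an element's end tag. The document below is a hypothetical, already well-formed stand-in for the tidied page; the real structure may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed XHTML; only the hr/text layout matters here.
SAMPLE = """<html><body>
<h2>Hypothetical listing title</h2>
<hr/>
Date: 2009-09-25, 1:54PM EDT<br/>
Reply to: see below<br/>
<hr/>
</body></html>"""

root = ET.fromstring(SAMPLE)

# find() returns the first match, i.e. the equivalent of /html/body/hr[1].
first_hr = root.find("./body/hr")

# The text node right after the hr is its tail; strip surrounding newlines.
date_line = (first_hr.tail or "").strip()
```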