I know there are thousands of simple questions about XPath, but I don't get how to combine two not-so-simple expressions...
My xml structure:
<div class="some-container">
<div class="btn btn-blue">
<div class="btn-text"><!-- Select by class -->
<span> <!-- Select by text-->
Download
</span>
</div>
</div>
</div>
Select by class
I managed to select the div by searching for the class:
//*/div[contains(concat(' ', @class, ' '), ' btn-text')]
Select by text
To also select the span I know I can simply add /span, but I want to select it by its text.
For that use case I got this XPath (from here):
//*/text()[normalize-space(.)='Download']/parent::*
Both selectors work properly, but I want to combine them:
Search for class "btn"
Search for text inside span that exactly matches
I tried to concatenate them like this, but it doesn't work (test example):
//div[contains(concat(' ', @class, ' '), ' btn-text')]/text()[normalize-space(.)='Download']/parent::*
Even if it worked, it would not select by the span tag.
Anyone who could help?
Your XPath is looking for a text node directly inside the div, but the text node you're looking for is inside a span. That's why it's not succeeding.
To get it to work, just change the XPath to look for the span and not the text node:
//div[contains(concat(' ', @class, ' '), ' btn-text ')]/span[normalize-space(.)='Download']
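For readers who want to check the logic of that expression outside an XPath engine, here is a minimal Python sketch using only the standard library. It is an approximation, not the answerer's method: `has_class` and `find_spans` are hypothetical helper names, and the filtering is done in Python because `xml.etree.ElementTree` does not support `contains()` or `normalize-space()`.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question (comments omitted).
DOC = """
<div class="some-container">
  <div class="btn btn-blue">
    <div class="btn-text">
      <span>
        Download
      </span>
    </div>
  </div>
</div>
""".strip()


def has_class(element, token):
    # Same idea as contains(concat(' ', @class, ' '), ' token '):
    # compare whole whitespace-separated tokens, not substrings.
    return token in element.get("class", "").split()


def find_spans(root, class_token, text):
    matches = []
    for div in root.iter("div"):
        if not has_class(div, class_token):
            continue
        # Only direct <span> children, like the /span step.
        for span in div.findall("span"):
            # " ".join(s.split()) approximates normalize-space(.)
            if " ".join("".join(span.itertext()).split()) == text:
                matches.append(span)
    return matches


root = ET.fromstring(DOC)
spans = find_spans(root, "btn-text", "Download")
print(len(spans), spans[0].tag)
```

Asking for the class token "btn" instead returns nothing here, because the span is not a direct child of the div that carries that class.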
Related
<div class="season-rate season-summer">
<p class="heading">Summer</p>
<p class="subHeading">from</p>
<p class="price">€180,000<span>p/week + expenses</span><span class="approx">Approx
$211,500</span></p>
</div>
I am trying to grab the price here (€180,000), given that the heading is "Summer":
//p[contains(.,'Summer')]/following-sibling::p[2]
This returns:
€180,000p/week + expensesApprox
$211,500
But I only want:
€180,000
So I want to stop the XPATH before this next span class:
<span class="approx">Approx
$211,500</span>
I am trying variations of this without any luck!
//p[contains(.,'Summer')]/following-sibling::p[2] [not(preceding-sibling::span[contains(.,'p/week')])]
You can try this expression to get price only
//p[.="Summer"]/following-sibling::p[@class="price"]/text()
I think this should do it:
//div[p[.="Summer"]]/p[@class="price"]/text()[not(self::span)]
or even simpler:
//div[p[.="Summer"]]/p[@class="price"]/text()[not(span)]
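The key point in these answers is that text() selects only the text nodes that are direct children of the price paragraph, so the span contents are left out. A small standard-library Python sketch of the same idea follows; `price_for` is a hypothetical helper, and ElementTree's `.text` attribute (the text before the first child element) stands in for the XPath text node.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question.
DOC = """
<div class="season-rate season-summer">
<p class="heading">Summer</p>
<p class="subHeading">from</p>
<p class="price">€180,000<span>p/week + expenses</span><span class="approx">Approx
$211,500</span></p>
</div>
""".strip()


def price_for(root, heading):
    for div in root.iter("div"):
        headings = [p.text for p in div.findall("p")
                    if p.get("class") == "heading"]
        if heading in headings:
            price = div.find("p[@class='price']")
            if price is not None:
                # .text holds only the text before the first child element,
                # so the <span> contents are not included.
                return (price.text or "").strip()
    return None


root = ET.fromstring(DOC)
print(price_for(root, "Summer"))
```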
I have an application that requires me to find a XPath selector for an element and then see if that XPath can be simplified.
So if I have
<a class="abc def gh">
I may determine that the XPath
a[contains(@class, "abc")]
is specific enough. The problem is, it also selects items with class "abcxyz",
Is there a way to select items with ONLY class "abc"?
i.e. I think it's clear but I want to find items that have a class of "abc" or "abc def" but not "abcxyz".
Here's a more specific example because I believe neither of the answers so far works:
<div>
<span id="x" class="btnSalePriceLabel">Sale:</span>
<span id="y" class="btnSalePrice highlight">$20.40</span>
</div>
I want whatever XPath selector will select the 2nd span and not the first.
If I try
//span[@class and contains(concat(' ', normalize-space(@class), ' '), ' btnSalesPrice ')]
I get nothing selected. Likewise with
//span[contains(concat(' ', normalize-space(@class), ' '), ' btnSalesPrice ')]
Since class attribute is a multi-valued attribute, you have to account for these spaces between the values with concat():
//a[contains(concat(' ', normalize-space(@class), ' '), ' abc ')]
Note that CSS selectors have this ability to match specific class values built-in:
a.abc
I think you can see what is more concise and readable.
It is better to use CSS for exact matches like this, especially with class attributes, in which case it would be:
a.abc
You can use one of the CSS-to-XPath converters available in several languages (check this one for example, in JavaScript); its translation would be:
descendant-or-self::a[@class and contains(concat(' ', normalize-space(@class), ' '), ' abc ')]
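To see why the space-padded form behaves differently from a plain contains(), the string logic can be reproduced in a few lines of Python. This is only a sketch of the idiom, with hypothetical function names, not an XPath evaluator. It also suggests why the expressions tried in the question select nothing: the sample markup uses the class btnSalePrice, while the expressions test for btnSalesPrice.

```python
def naive_contains(class_attr, token):
    # Plain contains(@class, 'token'): a substring match.
    return token in class_attr


def xpath_style_has_class(class_attr, token):
    # normalize-space(@class): trim and collapse runs of whitespace.
    normalized = " ".join(class_attr.split())
    # concat(' ', ..., ' ') followed by contains(..., ' token ').
    return f" {token} " in f" {normalized} "


for value in ["abc", "abc def gh", "  def   abc ", "abcxyz"]:
    print(repr(value),
          naive_contains(value, "abc"),
          xpath_style_has_class(value, "abc"))

# The class value from the question's second example.
sale = "btnSalePrice highlight"
print(xpath_style_has_class(sale, "btnSalePrice"))   # spelling used in the markup
print(xpath_style_has_class(sale, "btnSalesPrice"))  # spelling used in the XPath
```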
I'd like to use xquery (I believe) to output the text from the title attribute of an html element.
Example:
<div class="rating" title="1.0 stars">...</div>
I can use xpath to select the element, but it tries to output the info between the div tags. I think I need to use xquery to output the "1.0 stars" text from the title attribute.
There's gotta be a way to do this. My Google skills are proving ineffective in coming up with an answer.
Thanks.
XPath: //div[@class='rating']/@title
This will give you the title text for every div with a class of "rating".
Addendum (following from comments below):
If the class has other, additional text in it, in addition to "rating", then you can use something like this:
//div[contains(concat(' ', normalize-space(@class), ' '), ' rating ')]
(Hat tip to How can I match on an attribute that contains a certain string?).
You should use:
let $XML := <p><div class="rating" title="2.0 stars">sdfd</div><div class="rating" title="1.0 stars">sdfd</div></p>
for $title in $XML//@title
return
<p>{data($title)}</p>
to get output:
<p>2.0 stars</p>
<p>1.0 stars</p>
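If the scraping happens from Python rather than from an XQuery processor, the same attribute values can be read with the standard library. The following is a rough sketch under that assumption; `rating_titles` is a hypothetical helper and the markup is the sample from the XQuery answer.

```python
import xml.etree.ElementTree as ET

# Sample markup from the XQuery answer above.
DOC = ('<p><div class="rating" title="2.0 stars">sdfd</div>'
       '<div class="rating" title="1.0 stars">sdfd</div></p>')


def rating_titles(root):
    titles = []
    for div in root.iter("div"):
        # Token test, so class="rating big" would also qualify.
        if "rating" in div.get("class", "").split():
            title = div.get("title")  # attribute value, not element text
            if title is not None:
                titles.append(title)
    return titles


root = ET.fromstring(DOC)
print(rating_titles(root))
```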
I am doing some screen scraping with a library that takes XPath expressions and noticed that several pages are similar, but different.
Is there a way to loosely say "get me divs that have class='mytarget' but exist as a child of a div with class = 'nav' and the exact path is unknown between nav and mytarget."
<div class="nav">
<div>
??????
<div class="mytarget"></div>
??????
</div>
</div>
Yes, using the descendant-or-self axis (//):
//div[@class='nav']//div[@class='mytarget']
Or, if there can be more than one class name on those elements, then this is even better:
//div[contains(concat(' ', @class, ' '), ' nav ')]//
div[contains(concat(' ', @class, ' '), ' mytarget ')]
Warning: this can be very inefficient on large documents. You should use absolute paths wherever the structure is known. Only resort to // when the structure is unknown.
That's what "//" expressions are for. Something like:
//*[@class="nav"]//*[@class="mytarget"]
http://www.w3schools.com/xpath/xpath_syntax.asp
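As a quick way to try the descendant step, here is a small sketch with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset (paths must be relative, hence the leading dot). The `<body>` wrapper, the `<p>` filler and the second mytarget div are assumptions added for illustration; they are not part of the question's markup.

```python
import xml.etree.ElementTree as ET

# Illustrative markup: one target nested inside nav, one outside it.
DOC = """
<body>
  <div class="nav">
    <div>
      <p>unknown wrapper</p>
      <div class="mytarget">inside nav</div>
    </div>
  </div>
  <div class="mytarget">outside nav</div>
</body>
""".strip()

root = ET.fromstring(DOC)

# The second // matches at any depth below the nav div.
targets = root.findall(".//div[@class='nav']//div[@class='mytarget']")
print([t.text for t in targets])
```

Note that the `[@class='...']` predicate here is an exact attribute comparison, so elements carrying several class names would need a token test instead.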
I know how to get a list of DIVs of the same css class e.g
<div class="class1">1</div>
<div class="class1">2</div>
using the XPath //div[@class='class1']
But what if a div has multiple classes, e.g.
<div class="class1 class2">1</div>
What would the XPath look like then?
The expression you're looking for is:
//div[contains(@class, 'class1') and contains(@class, 'class2')]
I highly suggest XPath visualizer, which can help you debug xpath expressions easily. It can be found here:
http://xpathvisualizer.codeplex.com/
According to this answer, which explains why it is important to make sure substrings of the class name that one is looking for are not included, the correct answer should be:
//div[contains(concat(' ', normalize-space(@class), ' '), ' class1 ')
and contains(concat(' ', normalize-space(@class), ' '), ' class2 ')]
There's a useful Python package called cssselect.
from cssselect import GenericTranslator
GenericTranslator().css_to_xpath('div.gallery')
Generates a usable XPath:
descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' gallery ')]
It's very similar to Flynn1179's answer.
I think the expression you're looking for is
//div[starts-with(@class, "class1")]/text()
With XPath 3.1, you could also do:
//div[contains-token(@class, 'class_one') and contains-token(@class, 'class_two')]
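The answers in this thread differ mainly in how they treat class order and partial names. A short standard-library Python sketch can make the intended result concrete: it treats the class attribute as a set of tokens and requires every wanted token to be present. The sample markup and the `with_all_classes` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative markup covering the tricky cases.
DOC = """
<root>
  <div class="class1">1</div>
  <div class="class1 class2">2</div>
  <div class="class2 extra class1">3</div>
  <div class="class10 class2">4</div>
</root>
""".strip()


def with_all_classes(root, *tokens):
    wanted = set(tokens)
    # Subset test: every wanted token must appear as a whole class name.
    return [div.text for div in root.iter("div")
            if wanted <= set(div.get("class", "").split())]


root = ET.fromstring(DOC)
print(with_all_classes(root, "class1", "class2"))
```

In this sample a starts-with test on "class1" would miss div 3, where class1 comes last, and a plain substring contains() would also accept div 4, because "class10" contains "class1".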