XPath expression:
'.//div/concat(@id, " - ", @class)'
fails with an error:
The expression is not a legal expression.
in Firefox 25.0 (from a userscript).
Why, and how to fix?
For input:
<div id='id1' class='class1'>
sample
</div>
<div id='id2' class='class2'>
sample
</div>
I'd like to get two separate strings:
id1 - class1
id2 - class2
Firefox only supports XPath 1.0, but your expression requires XPath 2.0. There's no equivalent in XPath 1.0 (your expression returns a sequence of strings, which is a data type that doesn't exist in XPath 1.0).
Are you calling this XPath from XSLT or from JavaScript? Either way, you will have to do the work in the host language rather than in XPath.
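As a rough illustration of doing the work in the host language, here is a minimal Python sketch using the standard library's ElementTree. The wrapping `<root>` element and the variable names are assumptions added for the example; in a userscript the same idea would be a loop over the selected div elements in JavaScript.

```python
import xml.etree.ElementTree as ET

# Sample input from the question, wrapped in a hypothetical <root>
# element so that it parses as a single XML document.
SAMPLE = """<root>
<div id='id1' class='class1'>sample</div>
<div id='id2' class='class2'>sample</div>
</root>"""

root = ET.fromstring(SAMPLE)

# Select the nodes with XPath, then build each string in the host language.
labels = [f"{div.get('id')} - {div.get('class')}" for div in root.findall(".//div")]

for label in labels:
    print(label)
```

The XPath part only selects nodes; the concatenation per node happens outside XPath, which is what an XPath 1.0 processor forces you to do.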
I think you should do as below:
HTML
<div id='foo' class='bax'>
sample
</div>
XPATH
concat(//div/@id, " - ", //div/@class)
or, with an XPath 2.0 processor,
(//div)/concat(@id, ' - ', @class)
output
foo - bax
Related
I have an HTML element I would like to select that looks like this:
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
Now clearly I can use this XPath selector:
//button[(contains(@data-button-id,'close')) and (contains(@class,'modal'))]
to select the element. However, I would really like to select buttons that have both close and modal contained in any attributes. So I can generalize the selector and say:
//button[(contains(@*,'close')) and (contains(@class,'modal'))]
and that works. What I'd love to do is extend it to this:
//button[(contains(@*,'close')) and (contains(@*,'modal'))]
but that doesn't return any results. So clearly that doesn't mean what I'd like it to mean. Is there a way to do it correctly?
Thanks,
Craig
It looks like you're using XPath 1.0: in 1.0, if you supply a node-set as the first argument to contains(), it takes the first node in the node-set. The order of attributes is completely unpredictable, so there's no way of knowing whether contains(@*, 'close') will succeed or not. In 2.0+, this gives you an error.
In both 1.0 and 2.0, @*[contains(., 'close')] returns true if any attribute contains "close" as a substring.
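To make the "any attribute" semantics concrete, here is a small Python sketch that mirrors the predicate in host-language code rather than in XPath. The `<root>` wrapper, the third button, and the helper name `any_attr_contains` are assumptions made for the example; ElementTree is used only to parse the markup.

```python
import xml.etree.ElementTree as ET

# Two buttons from the thread plus a hypothetical third one that should not match.
SAMPLE = """<root>
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
<button key="close" last="xyz_modal"></button>
<button class="modal__cross"></button>
</root>"""


def any_attr_contains(element, needle):
    """True if at least one attribute value contains `needle`.

    Mirrors the XPath predicate @*[contains(., needle)].
    """
    return any(needle in value for value in element.attrib.values())


root = ET.fromstring(SAMPLE)
matches = [
    button
    for button in root.iter("button")
    if any_attr_contains(button, "close") and any_attr_contains(button, "modal")
]
print(len(matches))
```

Each attribute is tested on its own, so the result does not depend on attribute order, unlike contains(@*, 'close') in XPath 1.0.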
This expression works:
//button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]
Given this HTML
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
<button key="close" last="xyz_modal"></button>
Testing with xmllint
echo -e 'cat //button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]\nbye' | xmllint --html --shell test.html
/ > cat //button[attribute::*[contains(.,"close")] and attribute::*[contains(.,"modal")]]
-------
<button data-button-id="close" class="modal__cross modal__cross-web"></button>
-------
<button key="close" last="xyz_modal"></button>
/ > bye
Try this to select the required element:
//button[@*[contains(., 'close')] and @*[contains(., 'modal')]]
In this HTML, using Scrapy, I can access the full info-car attribute with the XPath './/@info-car':
<div class="car car-root"
info-car='{brand":"BMW","Price":"田"name":"X5","color":null,"}'>
</div>
What is the XPath to pick only the name from info-car?
You can obtain the name by combining XPath with a regex. See the sample code below:
response.xpath(".//@info-car").re_first(r'"name":"(.*?)"')
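The regex step can be tried on its own with Python's standard library. This is only a sketch: the `info_car` string below is a hypothetical, well-formed stand-in for the attribute value (the value shown in the question is not valid JSON), and the JSON route applies only if the real attribute parses cleanly.

```python
import json
import re

# Hypothetical attribute value, assumed to be well-formed JSON for illustration.
info_car = '{"brand":"BMW","name":"X5","color":null}'

# [^"]* stops at the first closing quote, so it cannot run past the name
# the way a greedy .* could on longer input.
match = re.search(r'"name":"([^"]*)"', info_car)
name = match.group(1) if match else None
print(name)

# If the attribute really is valid JSON, parsing it is more robust than a regex.
parsed_name = json.loads(info_car).get("name")
print(parsed_name)
```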
I have the following code :
<div class = "content">
<table id="detailsTable">...</table>
<div class = "desc">
<p>Some text</p>
</div>
<p>Another text</p>
</div>
I want to select all the text within the 'content' class, which I would get using this XPath:
doc.xpath('string(//div[@class="content"])')
The problem is that it selects all the text including text within the 'table' tag. I need to exclude the 'table' from the xPath. How would I achieve that?
XPath 1.0 solutions:
substring-after(string(//div[@class="content"]), string(//div[@class="content"]/table))
Or just use concat:
concat(//table/following::p[1], " ", //table/following::p[2])
The XPath expression //div[@class="content"] selects the div element - nothing more and nothing less - and applying the string() function gives you the string value of the element, which is the concatenation of all its descendant text nodes.
Getting all the text except that contained in one particular child is probably not possible in XPath 1.0. With XPath 2.0 it can be done as
string-join(//div[@class="content"]/(node() except table)//text(), '')
But for this kind of manipulation, you're really in the realm of transformation rather than pure selection, so you're stretching the limits of what XPath is designed for.
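If the filtering is moved into the host language instead, one possible sketch in Python with the standard library's ElementTree looks like this. The helper name `text_without`, the table contents, and the whitespace normalisation at the end are assumptions for the example; the sample markup is the question's snippet with the last `<p>` closed properly.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div class="content">
  <table id="detailsTable"><tr><td>Table text</td></tr></table>
  <div class="desc">
    <p>Some text</p>
  </div>
  <p>Another text</p>
</div>"""


def text_without(element, skip_tag):
    """Concatenate the descendant text of `element`, skipping `skip_tag` subtrees."""
    parts = [element.text or ""]
    for child in element:
        if child.tag != skip_tag:
            parts.append(text_without(child, skip_tag))
        # The tail follows the child's closing tag, so it belongs to `element`
        # and is kept even when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)


content = ET.fromstring(SAMPLE)  # the root element is the div.content itself
result = " ".join(text_without(content, "table").split())
print(result)
```

This walks the tree once and leaves the selection of the div to XPath or any other lookup, which keeps the XPath expression itself simple.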
Hello I have this HTML:
<div class="_3Vhpd"><span>Your commerce Data</span>
<a class="n3G0C" href='http://www.webadress.......'><span>Some Text</span></a>
</div>
I tried to obtain the href as follows:
parser.xpath('//div[contains(@class,"_3Vhpd")]//following-sibling::*[a[@class="n3G0C"]]/@href')
but I received an empty list, '[]'. Maybe that is because the a element is not directly after the div but after a span...
First, your sample HTML doesn't have a class="n3G0C", but assuming you fix it, this XPath expression should work:
//div[contains(@class,"_3Vhpd")]//following-sibling::a/@href
Output:
http://www.webadress.......
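In the sample markup the a element is a child of the div (and a sibling of the span), so a plain child step is enough. The sketch below shows that simpler alternative with Python's standard-library ElementTree instead of the following-sibling axis; the exact-match class predicate is used because ElementTree's limited XPath subset has no contains(), and the markup is the question's snippet with the closing span tag repaired.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div class="_3Vhpd"><span>Your commerce Data</span>
<a class="n3G0C" href="http://www.webadress......."><span>Some Text</span></a>
</div>"""

root = ET.fromstring(SAMPLE)  # the root element is the div itself

# The link is a direct child of the div, so no sibling axis is needed.
hrefs = [a.get("href") for a in root.findall("./a[@class='n3G0C']")]
print(hrefs)
```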
What is wrong with my path to select the following:
<label class="form-control-label" for="profile_form_state">State</label>
Xpath:
xpath = '//label[ends-with(#for, "_state")]'
I am using rspec and capybara
expect(rendered).to have_xpath(xpath)
Error:
xmlXPathCompOpEval: function ends-with not found
As answered by @har07, XPath 1.0 (which browsers implement) doesn't have an ends-with function, but CSS provides an ends-with attribute selector, $=:
expect(rendered).to have_css('label[for$="_state"]')
or you can use a regex with Capybara's built-in :label selector:
expect(rendered).to have_selector(:label, for: /_state$/)
If you really want to stick with XPath over CSS, you can use the xpath gem that Capybara uses internally to generate its own XPaths and write:
xpath = XPath.descendant(:label).where(XPath.attr(:for).ends_with('_state'))
expect(rendered).to have_xpath(xpath)
It looks like your XPath processor only supports XPath 1.0, while ends-with() is defined in XPath 2.0 and above. You can simulate ends-with() in XPath 1.0 using substring() and string-length():
xpath = '//label["_state" = substring(@for, string-length(@for) - string-length("_state") + 1)]'
You can shorten the expression a bit by replacing - string-length("_state") + 1 with the pre-calculated offset - 5 (the length of "_state", which is 6, minus one):
xpath = '//label["_state" = substring(@for, string-length(@for) - 5)]'
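The offset arithmetic is easy to get wrong, so here is a small Python sketch that mimics it. The helper names `xpath_substring` and `ends_with_xpath1` are made up for this example, and the substring helper only covers the two-argument form with an integer start, which is all the expression above needs.

```python
def xpath_substring(s: str, start: int) -> str:
    """Mimic XPath 1.0 substring(s, start) for an integer start.

    XPath positions are 1-based; a start below 1 yields the whole string.
    """
    return s[max(start, 1) - 1:]


def ends_with_xpath1(value: str, suffix: str) -> bool:
    # Same arithmetic as:
    #   suffix = substring(value, string-length(value) - string-length(suffix) + 1)
    start = len(value) - len(suffix) + 1
    return suffix == xpath_substring(value, start)


print(ends_with_xpath1("profile_form_state", "_state"))
print(ends_with_xpath1("profile_form_status", "_state"))
```

For "profile_form_state" the start position is 18 - 6 + 1 = 13, and the substring from position 13 is "_state", which matches the shortened form string-length(@for) - 5.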
Try the following XPath.
xpath = '//label[contains(., "State")]'
Other possible XPaths:
//label[contains(@for, '_state')]
or
//label[text()='State']
or
//label[contains(text(), 'State')]