How can I find elements that have at least one attribute?
Example:
<tr>...</tr>
<tr style="">...</tr>
<tr width="">...</tr>
I want all tr elements but ...
I tried the following XPath, but it doesn't work.
//table//tr[contains(attributes::*,'')]
Thanks
This should do it:
//table/tr[@*]
The reason yours doesn't work is that contains() always returns true when the second argument is ''. When an expression inside square brackets returns a node-set, it is considered true if the set is non-empty and false if it is empty. So @* selects the set of all attributes of the element, and the predicate [@*] is true if any are present.
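To see the same "non-empty means true" idea outside XPath, here is a minimal Python sketch using only the standard library. Python's xml.etree.ElementTree implements just a subset of XPath and does not accept [@*], so the "has any attribute" test is written as a plain check on each element's attrib dict. The sample table and the function name rows_with_attributes are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample table, modelled on the rows in the question.
SAMPLE = """
<table>
  <tr><td>plain</td></tr>
  <tr style=""><td>styled</td></tr>
  <tr width=""><td>sized</td></tr>
</table>
"""

def rows_with_attributes(xml_text):
    """Return the tr children of the root table that carry at least one attribute."""
    table = ET.fromstring(xml_text)
    # Similar in spirit to //table/tr[@*]: an empty attrib dict is falsy,
    # just as an empty node-set is false inside an XPath predicate.
    return [tr for tr in table.findall("tr") if tr.attrib]

rows = rows_with_attributes(SAMPLE)
print([sorted(tr.attrib) for tr in rows])
```

Note that an attribute with an empty value, such as style="", still counts as present.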
Given this XML, what XPath returns all elements whose prop attribute contains Foo (the first three nodes):
<bla>
<a prop="Foo1"/>
<a prop="Foo2"/>
<a prop="3Foo"/>
<a prop="Bar"/>
</bla>
//a[contains(@prop,'Foo')]
This works for me when I run it against the following XML:
<bla>
<a prop="Foo1">a</a>
<a prop="Foo2">b</a>
<a prop="3Foo">c</a>
<a prop="Bar">a</a>
</bla>
Edit:
Another thing to note: while the XPath above returns the correct answer for that particular XML, if you want to guarantee that you only get the "a" elements that are children of the "bla" element, you should, as others have mentioned, also use
/bla/a[contains(@prop,'Foo')]
By contrast, this one searches all "a" elements in the entire XML document, regardless of whether they are nested in a "bla" element:
//a[contains(@prop,'Foo')]
I added this for the sake of thoroughness and in the spirit of stackoverflow. :)
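A rough Python sketch of the difference between the two paths, assuming a made-up document in which one matching "a" sits deeper in the tree. The standard library's ElementTree has no contains(), so the helper prop_contains (a hypothetical name) stands in for that predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one matching "a" is nested deeper, outside /bla/a.
SAMPLE = """
<bla>
  <a prop="Foo1">a</a>
  <a prop="Bar">b</a>
  <other><a prop="Foo9">nested</a></other>
</bla>
"""

root = ET.fromstring(SAMPLE)

def prop_contains(elem, needle):
    # Mirrors contains(@prop, needle); a missing attribute is treated as "".
    return needle in elem.get("prop", "")

# Like /bla/a[contains(@prop,'Foo')]: only direct children of the root.
direct = [a.get("prop") for a in root.findall("a") if prop_contains(a, "Foo")]

# Like //a[contains(@prop,'Foo')]: every "a" at any depth.
anywhere = [a.get("prop") for a in root.iter("a") if prop_contains(a, "Foo")]

print(direct)
print(anywhere)
```

Here direct holds only the top-level match, while anywhere also picks up the nested one.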
This XPath will give you all nodes that have attributes containing 'Foo' regardless of node name or attribute name:
//attribute::*[contains(., 'Foo')]/..
Of course, if you're more interested in the contents of the attribute themselves, and not necessarily their parent node, just drop the /..
//attribute::*[contains(., 'Foo')]
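For illustration, the same two selections can be emulated in plain Python with the standard library. ElementTree cannot select attribute nodes, so this sketch walks every element and inspects its attrib dict. The sample document is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with differently named elements and attributes.
SAMPLE = """
<bla>
  <a prop="Foo1"/>
  <b other="xFoo"/>
  <c prop="Bar"/>
</bla>
"""

root = ET.fromstring(SAMPLE)

# Like //attribute::*[contains(., 'Foo')]/.. : elements owning a matching attribute.
owners = [el.tag for el in root.iter()
          if any("Foo" in value for value in el.attrib.values())]

# Like //attribute::*[contains(., 'Foo')] : the matching attributes themselves.
matches = [(name, value)
           for el in root.iter()
           for name, value in el.attrib.items()
           if "Foo" in value]

print(owners)
print(matches)
```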
descendant-or-self::*[contains(@prop,'Foo')]
Or:
/bla/a[contains(@prop,'Foo')]
Or:
/bla/a[position() <= 3]
Dissected:
descendant-or-self::
The Axis - search through every node underneath and the node itself. It is often better to say this than //. I have encountered some implementations where // means anywhere (descendant-or-self of the root node), while others use the default axis.
* or /bla/a
The Tag - a wildcard match, and /bla/a is an absolute path.
[contains(@prop,'Foo')] or [position() <= 3]
The condition within [ ]. @prop is shorthand for attribute::prop, as attribute is another search axis. Alternatively you can select the first 3 by using the position() function.
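The two kinds of condition can be compared with a small Python sketch (standard library only, with list slicing and a string test standing in for the XPath predicates, since ElementTree supports neither position() comparisons nor contains()). The XML is the sample from the question, written on one line.

```python
import xml.etree.ElementTree as ET

SAMPLE = '<bla><a prop="Foo1"/><a prop="Foo2"/><a prop="3Foo"/><a prop="Bar"/></bla>'
root = ET.fromstring(SAMPLE)

# Like /bla/a[position() <= 3]: the first three "a" children, whatever they contain.
by_position = [a.get("prop") for a in root.findall("a")[:3]]

# Like /bla/a[contains(@prop,'Foo')]: selected by attribute content instead.
by_content = [a.get("prop") for a in root.findall("a")
              if "Foo" in a.get("prop", "")]

print(by_position)
print(by_content)
```

The two results coincide only because the matching elements happen to come first in this document; the position-based version would pick up "Bar" if the order changed.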
Have you tried something like:
//a[contains(@prop, "Foo")]
I've never used the contains function before but suspect that it should work as advertised...
John C is the closest, but XPath is case sensitive, so the correct XPath would be:
/bla/a[contains(@prop, 'Foo')]
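A quick way to convince yourself of the case sensitivity is this Python sketch. Python's "in" operator is case sensitive in the same way as XPath's contains(), so it serves as a stand-in here. The helper name props_containing is hypothetical, and the XML is the sample from the question.

```python
import xml.etree.ElementTree as ET

SAMPLE = '<bla><a prop="Foo1"/><a prop="Foo2"/><a prop="3Foo"/><a prop="Bar"/></bla>'
root = ET.fromstring(SAMPLE)

def props_containing(needle):
    # Plain-Python stand-in for /bla/a[contains(@prop, needle)];
    # the substring test is case sensitive, like contains().
    return [a.get("prop") for a in root.findall("a")
            if needle in a.get("prop", "")]

upper = props_containing("Foo")
lower = props_containing("foo")

print(upper)
print(lower)
```

With the lowercase needle nothing matches, which is why the 'foo' variants of the expression come back empty on this XML.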
If you also need to match the content of the link itself, use text():
//a[contains(@href,"/some_link")][text()="Click here"]
/bla/a[contains(@prop, "foo")]
try this:
//a[contains(@prop,'foo')]
that should work for any "a" tags in the document
For the code above...
//*[contains(@prop,'foo')]
I ran into a problem when I tried to get the style attribute of an element.
$styleValue = $this->getAttribute("//ul@style");
But the result of var_dump($styleValue) is
string(1) ";"
But I expected "margin-left: -2432px;"
So where am I going wrong? How can I get the style attribute of an element?
Your XPath expression is wrong. If you want to read the @style attribute of the <ul/> tag, you have to use two step expressions: one stepping into the list, then one stepping into the attribute. The steps must be separated by a /; the @ only denotes an attribute.
//ul/@style
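The two steps can be made explicit in a small Python sketch. This is not the PHP API from the question, just an illustration with the standard library's ElementTree, which cannot select attributes in a path and therefore forces you to do the second step by hand. The surrounding markup is made up; only the style value comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the style value is taken from the question.
SAMPLE = '<div><ul style="margin-left: -2432px;"><li>item</li></ul></div>'
root = ET.fromstring(SAMPLE)

# Step 1: locate the element (the //ul part of the expression).
ul = root.find(".//ul")

# Step 2: read the attribute from that element (the /@style part).
style_value = ul.get("style") if ul is not None else None

print(style_value)
```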
I have plenty of links like this:
<b>Edit issue >></b>
Trying to extract the content of the href, I use this XPath expression:
//a[contains(@href,'/edit_flat')]
but it returns null. What am I doing wrong?
//a[contains(@href,'/edit_flat')] selects a elements anywhere in the document tree that have an href attribute containing the '/edit_flat' string.
These matching elements do have this very "href" attribute, but the XPath expression you are using returns "only" the a elements, if there are any.
To actually return the matching elements' attribute values, you need an extra step, with / and @href. So what you want is:
//a[contains(@href,'/edit_flat')]/@href
Suggestion:
What you really want is probably to select links whose href begins with the substring "/edit_flat", so it's safer to use:
.//a[starts-with(@href,'/edit_flat')]/@href
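To illustrate why starts-with() is the safer test, here is a minimal Python sketch using the standard library. The href values are hypothetical, since the real ones are not shown in the question, and Python's "in" and str.startswith stand in for contains() and starts-with().

```python
import xml.etree.ElementTree as ET

# Hypothetical links; the real href values are not shown in the question.
SAMPLE = """
<div>
  <a href="/edit_flat/1"><b>Edit issue &gt;&gt;</b></a>
  <a href="/archive/edit_flat/2"><b>Old issue</b></a>
  <a href="/view/3"><b>View</b></a>
</div>
"""

root = ET.fromstring(SAMPLE)

# Collect the attribute values rather than the elements (the /@href step).
hrefs = [a.get("href", "") for a in root.iter("a")]

# Like contains(@href,'/edit_flat'): matches the substring anywhere.
containing = [h for h in hrefs if "/edit_flat" in h]

# Like starts-with(@href,'/edit_flat'): matches only at the beginning.
starting = [h for h in hrefs if h.startswith("/edit_flat")]

print(containing)
print(starting)
```

The contains-style test also picks up the archive link, which is probably not what you want.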
I have the following XPATH that selects elements containing certain strings ("video" or "color" or "black and white"). The issue I am having is that one of the elements that is selected contains a string "video reprints" and although it's correct, I do not want this particular element selected. I thought I could specify NOT in the XPATH as in the following...
//div/A[contains(., 'video') or contains(., 'color') or contains(., 'black and white') and (not (contains(., 'reprint')))]
Any thoughts on how I can remove any selection that contains the string "reprints" from the selections above?
This is a precedence issue. Just wrap all the or-ed conditions into parentheses:
[( ... or ... or ...) and (not(...))]
Really it's just because of the way you have your parentheses, so this will work:
//div/A[(contains(., 'video') or contains(., 'color') or contains(., 'black and white')) and not(contains(., 'reprints'))]
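Python's "and" also binds more tightly than "or", so the effect of the parentheses can be reproduced with a short sketch. The two function names are made up, and the substring tests stand in for contains(., ...).

```python
def original_predicate(text):
    # "and" binds tighter than "or", so the "not reprint" test only guards
    # the last alternative ("black and white").
    return ("video" in text
            or "color" in text
            or "black and white" in text and "reprint" not in text)

def fixed_predicate(text):
    # Parentheses group the alternatives first, then the exclusion applies to all.
    return (("video" in text or "color" in text or "black and white" in text)
            and "reprint" not in text)

for sample in ("video", "video reprints", "black and white reprints"):
    print(sample, original_predicate(sample), fixed_predicate(sample))
```

With the original grouping, "video reprints" still matches because the first alternative is already true on its own.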
I'd like to use Nokogiri to extract all nodes in an element that contain a specific attribute name.
e.g., I'd like to find the 2 nodes that contain the attribute "blah" in the document below.
@doc = Nokogiri::HTML::DocumentFragment.parse <<-EOHTML
<body>
<h1 blah="afadf">Three's Company</h1>
<div>A love triangle.</div>
<b blah="adfadf">test test test</b>
</body>
EOHTML
I found this suggestion (below) at this website: http://snippets.dzone.com/posts/show/7994, but it doesn't return the 2 nodes in the example above. It returns an empty array.
# get elements with attribute:
elements = @doc.xpath("//*[@*[blah]]")
Thoughts on how to do this?
Thanks!
I found this here
elements = @doc.xpath("//*[@*[blah]]")
This is not a useful XPath expression. It says to give you all elements that have attributes that have child elements named 'blah'. And since attributes can't have child elements, this XPath will never return anything.
The DZone snippet is confusing in that when they say
elements = @doc.xpath("//*[@attribute_name]") with extra brackets, i.e. written as "//*[@*[attribute_name]]"
the inner square brackets are not literal... they're there to indicate that you put in the attribute name. Whereas the outer square brackets are literal. :-p
They also have an extra * in there, after the @.
What you want is
elements = @doc.xpath("//*[@blah]")
This will give you all the elements that have an attribute named 'blah'.
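The [@name] predicate is simple enough that even limited XPath implementations tend to support it. As a stand-in for the Nokogiri call (this is Python's standard library, not Nokogiri), a minimal sketch against the same markup, parsed as XML, might look like this:

```python
import xml.etree.ElementTree as ET

# The markup from the question, treated as well-formed XML.
SAMPLE = """
<body>
  <h1 blah="afadf">Three's Company</h1>
  <div>A love triangle.</div>
  <b blah="adfadf">test test test</b>
</body>
"""

root = ET.fromstring(SAMPLE)

# ElementTree understands the [@attrib] predicate; the leading "." makes the
# search relative to the root element.
elements = root.findall(".//*[@blah]")
tags = [el.tag for el in elements]

print(tags)
```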
You can use CSS selectors:
elements = @doc.css "[blah]"