How to get the preceding element? - xpath

<p class="small" style="margin: 16px 4px 8px;">
<b>
<a class="menu-root" href="#pg-jump">Pages</a>
:
<b>1</b>
,
<a class="pg" href="viewforum.php?f=941&start=50">2</a>
,
<a class="pg" href="viewforum.php?f=941&start=100">3</a>
...
<a class="pg" href="viewforum.php?f=941&start=8400">169</a>
,
<a class="pg" href="viewforum.php?f=941&start=8450">170</a>
,
<a class="pg" href="viewforum.php?f=941&start=8500">171</a>
<a class="pg" href="viewforum.php?f=941&start=50">Next.</a>
</b>
</p>
I want to select the a element containing 171, i.e. the element immediately preceding the Next. link.
//a[.='Next.'] (not sure how to use preceding here)

You can use this xpath:
//a[.="Next."]/preceding::a[1]
If I were to diagram it out, using an X to represent the current location, it would look like this:
------------------+------+------------------
preceding-sibling | self | following-sibling
------------------|------|------------------
last()  ...  2  1 |  X   | 1  2  ...  last()
------------------+------+------------------
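To make the "nearest preceding a" idea concrete, here is a minimal Python sketch. It assumes a trimmed, well-formed copy of the pager markup (hrefs shortened for illustration) and uses only the standard library. ElementTree's XPath subset has no preceding axis, so the helper `preceding_a` (a hypothetical name) emulates `//a[.="Next."]/preceding::a[1]` by walking the elements in document order.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the pager from the question.
# The shortened hrefs are an assumption made for this illustration.
PAGER = """
<p class="small">
  <b>
    <a class="menu-root" href="#pg-jump">Pages</a>:
    <b>1</b>,
    <a class="pg" href="?start=50">2</a>,
    <a class="pg" href="?start=8500">171</a>
    <a class="pg" href="?start=50">Next.</a>
  </b>
</p>
"""

def preceding_a(root, link_text):
    """Emulate //a[.=link_text]/preceding::a[1] by walking document order."""
    nodes = list(root.iter())  # all elements, in document order
    for i, node in enumerate(nodes):
        if node.tag == "a" and "".join(node.itertext()) == link_text:
            # Position 1 on the reverse axis is the nearest earlier <a>.
            for earlier in reversed(nodes[:i]):
                if earlier.tag == "a":
                    return earlier
    return None

root = ET.fromstring(PAGER)
hit = preceding_a(root, "Next.")
print(hit.text, hit.get("href"))
```

With an engine that supports the full axis set, the XPath expression itself does the same job in one line; the loop is only there to show how position 1 counts backwards from the current node.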

//a[contains(text(), 'Next.')]/preceding::a[contains(text(), '171')]
Explanation of the XPath: match the Next. link by its text, then use the preceding axis to locate the a element whose text contains 171.

I know this is old, and if you didn't know the element containing the "Next." link this wouldn't be a solution for you. But if you want to find exactly that element and there are several "171" elements all over the page, you could use the following to distinguish it from the rest:
//p[b[contains(., 'Next.')]]//a[contains(., '171')]

Related

How Can I get text of all child nodes

I want to get the text of the a tag and the span tag at the same time.
<td class="desc_autoHeight">
<a rel="nofollow" href="#" target="_blank">Silicon Power</a>
<br><span class="FreeGift">48 Hours Only</span>
</td>
<td class="desc_autoHeight">
<a rel="nofollow" href="#" target="_blank">Silicon Power</a>
48 Hours Only
</td>
Result should be Silicon Power 48 Hours Only
Here is the xpath with concat
concat(//td[@class='desc_autoHeight']/a/text(), ' ', //td[@class='desc_autoHeight']//span[@class='FreeGift']/text())
If $td is the td element in question, then normalize-space($td) gives you the string you are looking for, at least in this particular example.
In many cases simply using the string value of the element (which you can get using string(), but in many cases that's implicit) is adequate. The difference with normalize-space() is that it turns chunks of spaces and newlines (like the whitespace before the <a> start tag) into a single space, or eliminates it at the start and end.
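As a rough illustration of what normalize-space($td) computes, the sketch below emulates it in Python with the standard library. The helper name `normalize_space` is an assumption for this example, and `<br>` is written as `<br/>` so the snippet parses as XML.

```python
import xml.etree.ElementTree as ET

# The first td from the question; <br> is closed so the snippet is well-formed.
TD = """<td class="desc_autoHeight">
<a rel="nofollow" href="#" target="_blank">Silicon Power</a>
<br/><span class="FreeGift">48 Hours Only</span>
</td>"""

def normalize_space(element):
    """Approximate XPath normalize-space() on an element's string value."""
    # The string value is the concatenation of all descendant text nodes.
    string_value = "".join(element.itertext())
    # split()/join trims both ends and collapses whitespace runs to one space.
    # Note: str.split() also treats some Unicode spaces as whitespace,
    # which XPath does not, so this is only an approximation.
    return " ".join(string_value.split())

td = ET.fromstring(TD)
result = normalize_space(td)
print(result)  # Silicon Power 48 Hours Only
```

The newline between the a element and the span is what supplies the single space in the result, so no concat() is needed.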

Xpath identifier for an span element with an 'i' tag inside it

I'm trying to find an XPath for the element in the HTML below.
This XPath works:
//span[@title='Open']
But this one does not:
//span[text()='Open']
I want an XPath that uses the span and its text. How can this be done?
<span class="m-t-5">
<span class="label statusopen" title="Open" onclick="javascript:toggleHistoryTab(this,'tw_831158485664530432','2302','SOLR','-1','s360-379359269');" style="cursor: pointer; text-decoration: none;">
<i class="fa fa-envelope-open-o"/>
Open
</span>
</span>
In my test case, it is more appropriate to find the element by its text, "Open".
This XPath,
//span[normalize-space()='Open'][not(.//span)]
will select those span elements whose normalized string value is "Open" and will exclude parent span elements such as the one in your example with class="m-t-5".
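The following Python sketch shows why the `[not(.//span)]` part matters. It assumes a reduced copy of the markup (the onclick and style attributes are dropped) and emulates the two predicates with the standard library; `normalize_space` is a hypothetical helper, not a library function.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the markup from the question (onclick/style omitted).
MARKUP = """<span class="m-t-5">
<span class="label statusopen" title="Open">
<i class="fa fa-envelope-open-o"/>
Open
</span>
</span>"""

def normalize_space(element):
    """Approximate XPath normalize-space() on an element's string value."""
    return " ".join("".join(element.itertext()).split())

root = ET.fromstring(MARKUP)

# Both spans have the normalized string value "Open", because an element's
# string value includes the text of all its descendants.
by_text = [s for s in root.iter("span") if normalize_space(s) == "Open"]

# Emulates [not(.//span)]: keep only spans with no span descendant.
innermost = [s for s in by_text if s.find(".//span") is None]

print(len(by_text), [s.get("class") for s in innermost])
```

Without the second filter the outer wrapper would match as well, which is usually not the element a test wants to click.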
If you need XPath for span element with an 'i' tag inside it you may try:
//span[i]
If you need even more specific XPath for span element with an 'i' tag inside it that contains text "Open":
//span[i[normalize-space(text())="Open"]]

xpath translate two expressions

I can't figure out two expressions in xpath. Can someone help ?
Here they are
substring-after(substring-before(//ul[@id='biblio']/li[3], ']'), '[')
//h2[normalize-space(string())='name']/preceding::h1[1]
Your first expression:
substring-after(substring-before(//ul[@id='biblio']/li[3], ']'), '[')
First, //ul finds all ul elements anywhere in the document. The predicate [@id='biblio'] keeps only those whose id attribute has the value 'biblio', and /li[3] then selects the 3rd li child of each matching ul.
It will then perform the substring functions on the string value of the li element(s), after atomizing them to a string.
So for example, if the text of a matched li element was hello [world], you would end up with just world as the result. As a more complete example, given the XML input:
<div>
<ul id="biblio">
<li>thing [one]</li>
<li>thing [two]</li>
<li>thing [three]</li>
</ul>
<ul id="biblio">
<li>other [a]</li>
<li>other [b]</li>
<li>other [c]</li>
</ul>
</div>
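In XPath 1.0 only the first matching li is used, so the result of your expression is three. In XPath 2.0 the expression as written raises a type error on this input, because substring-before() accepts at most one item and two li elements match. To get a sequence of two strings, three and c, in XPath 2.0, apply the functions per item: //ul[@id='biblio']/li[3]/substring-after(substring-before(., ']'), '['). Note that the use of <div> in the example input is just a container and could be any element.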
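The nested substring calls can be sketched in Python as follows. The helpers `substring_before` and `substring_after` are hypothetical names that mirror the XPath functions for a non-empty separator; the element lookup uses the path subset that the standard library's ElementTree supports.

```python
import xml.etree.ElementTree as ET

BIBLIO = """<div>
<ul id="biblio">
<li>thing [one]</li>
<li>thing [two]</li>
<li>thing [three]</li>
</ul>
<ul id="biblio">
<li>other [a]</li>
<li>other [b]</li>
<li>other [c]</li>
</ul>
</div>"""

def substring_before(text, sep):
    """Like XPath substring-before(): '' when sep does not occur."""
    head, found, _ = text.partition(sep)
    return head if found else ""

def substring_after(text, sep):
    """Like XPath substring-after(): '' when sep does not occur."""
    return text.partition(sep)[2]

root = ET.fromstring(BIBLIO)
# Third li child of every ul whose id is 'biblio'.
third_items = root.findall(".//ul[@id='biblio']/li[3]")

# Apply the two functions to each item separately.
results = [
    substring_after(substring_before("".join(li.itertext()), "]"), "[")
    for li in third_items
]
print(results)  # ['three', 'c']
```

The inner call cuts everything from the first "]" onwards, and the outer call then drops everything up to and including the first "[".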
Your second expression:
//h2[normalize-space(string())='name']/preceding::h1[1]
First, this finds all h2 elements anywhere in the document whose normalized string value is equal to name. From each of them it then selects the nearest preceding h1, which is position 1 on the reverse preceding axis.
So for example, given the XML input:
<div>
<h1>title1</h1>
<p>stuff</p>
<h1>title2</h1>
<p>more stuff</p>
<h2>name</h2>
<p>other stuff</p>
</div>
You would get the following XML output as a result of your XPath expression:
<h1>title2</h1>
Hope that helps you understand...

Xpath first occurrence of a tree

I want to find the first occurrence of a tree. Example:
<div id='post>
<p>text1</p>
<p>text2</p>
<img src="a.jpg">
<img src="b.jpg">
<p>text3</p>
<p>text4</p>
<img src="c.jpg">
<p>text5</p>
</div>
I want to find the first occurrence of "p/img/@src".
When I do the XPath search .//div/p/img[1]/@src
it gives 2 hits, a.jpg and c.jpg.
What is the XPath for only the first occurrence (a.jpg)?
I would say .//div/(p/img)[1]/@src, but that is of course not working.
The best option would be:
(//img[@src])[1]/@src
or
(//p//img[@src])[1]/@src
to ensure the img itself is within a p element.
As Martin says, img is not a child of p. Moreover, your example is missing the closing single quote of the id attribute on the div, and the img tags are not closed.
Here is your XML, corrected:
<div id='post'>
<p>text1</p>
<p>text2</p>
<img src="a.jpg"/>
<img src="b.jpg"/>
<p>text3</p>
<p>text4</p>
<img src="c.jpg"/>
<p>text5</p>
</div>
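Now, to select the first image, you can simply use //img[1]/@src or //img[@src="a.jpg"]. Note that //img[1] selects every img that is the first img child of its parent, so it returns a single node here only because all the img elements are siblings.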
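The difference between a per-parent position and a position in the whole result can be seen in a small Python sketch. The document below is a hypothetical variant in which the images sit inside p elements, as the path in the question assumed; it is not the markup from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant: images nested inside <p>, unlike the original markup.
DOC = """<div id="post">
<p>text1 <img src="a.jpg"/> <img src="b.jpg"/></p>
<p>text2</p>
<p>text3 <img src="c.jpg"/></p>
</div>"""

root = ET.fromstring(DOC)

# img[1] is evaluated per parent: the first img child of each <p>.
per_parent = [img.get("src") for img in root.findall(".//img[1]")]

# Emulates (//img)[1]: the first img in document order, whatever its parent.
first_overall = root.find(".//img").get("src")

print(per_parent, first_overall)
```

This is the same effect the question describes: a positional predicate attached to a step yields one hit per parent, while parenthesising the whole path first yields a single node.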

xpath for locating li with text does not work

Using the xpath //ul//li[contains(text(),"outer")] to find a li in the outer ul does not work
<ul>
<li>
<span> not unique text, </span>
<span> not unique text, </span>
outer ul li 1
<ul >
<li> inner ul li 1 </li>
<li> inner ul li 2 </li>
</ul>
</li>
<li>
<span> not unique text, </span>
<span> not unique text, </span>
outer ul li 2
<ul >
<li> inner ul li 1 </li>
<li> inner ul li 2 </li>
</ul>
</li>
</ul>
Any idea how to find a li with a specific text in the outer ul?
Thank you
This will work for you: //ul//li[contains(.,"outer")]
I expect that you only want to consider the text nodes which are direct children of the li. Therefore you are right to use text() (if you use contains(.,"outer"), this will also consider text from any descendants of the li).
Therefore try this:
//ul/li[text()[contains(.,'outer')]]
Running this with Saxon, the original XPath expression gives:
XPTY0004: A sequence of more than one item is not allowed as the first argument of
contains() ("", "", ...)
Now, I guess Selenium is probably using XPath 1.0 rather than XPath 2.0, and in 1.0 the contains() function has "first item semantics" - it converts its argument to a string, which if the argument is a node-set containing more than one node, involves considering only the first node. And the first text node is probably whitespace.
If you want to test whether some child text node contains "outer", use
//ul//li[text()[contains(.,"outer")]]
Another reason for switching to XPath 2.0...
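The "first item semantics" can be reproduced with a short Python sketch. It assumes one outer li from the question's markup and a hypothetical helper, `child_text_nodes`, that lists the text nodes which are direct children of an element, since ElementTree exposes them as `text` and `tail` rather than as nodes.

```python
import xml.etree.ElementTree as ET

# One outer <li> from the question, kept with its original line breaks.
LIST = """<ul>
<li>
<span> not unique text, </span>
<span> not unique text, </span>
outer ul li 1
<ul>
<li> inner ul li 1 </li>
<li> inner ul li 2 </li>
</ul>
</li>
</ul>"""

def child_text_nodes(element):
    """Text nodes that are direct children of element, in document order."""
    nodes = [element.text] + [child.tail for child in element]
    return [text for text in nodes if text]

root = ET.fromstring(LIST)
items = list(root.iter("li"))

# XPath 1.0 contains(text(), "outer"): only the first text node is tested,
# and for the outer li that node is just the newline before the first span.
first_only = [li for li in items
              if child_text_nodes(li) and "outer" in child_text_nodes(li)[0]]

# text()[contains(., "outer")]: every child text node gets its own test.
any_child = [li for li in items
             if any("outer" in text for text in child_text_nodes(li))]

print(len(first_only), len(any_child))  # 0 1
```

The first filter finds nothing for the same reason the original expression fails, while the second finds the outer li without looking into the nested list.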
For the above issue, this solution will work:
//ul//li[contains(.,"outer")]
"." selects the current node, so contains() tests its full string value.

Resources