How to find xpath of a text element without node - xpath

<h1>
<span class="visually-hidden">BBC Radio</span>
Search results for 'archers'
</h1>
I want to locate the text "Search results for 'archers'". Which XPath will select that text and not the text inside the span node?

For your input sample
/h1/text()
Tested on http://videlibri.sourceforge.net/cgi-bin/xidelcgi
returns Search results for 'archers'
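To check this without an online tester, here is a minimal Python sketch. It is an emulation, not a direct XPath evaluation: the standard library's xml.etree.ElementTree implements only a subset of XPath and has no text() step, so the direct text nodes of h1 are gathered from the element's text and each child's tail. With the markup as written, /h1/text() also selects the whitespace-only text node before the span; the sketch drops it. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

markup = """<h1>
<span class="visually-hidden">BBC Radio</span>
Search results for 'archers'
</h1>"""

h1 = ET.fromstring(markup)

# ElementTree stores the text before the first child in .text and the
# text that follows a child element in that child's .tail.
direct_text_nodes = [h1.text] + [child.tail for child in h1]

# Drop empty or whitespace-only nodes and trim the rest.
visible = [t.strip() for t in direct_text_nodes if t and t.strip()]
print(visible)
```

The text inside the span is never collected because it belongs to the span element, not to h1 directly.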

Related

replace full string in xpath just get before

I am looking for a way to remove part of a string value obtained from a web page with an XPath function.
I have this:
<div id="article_body" class="">
This my wonderful sentence, however here the string i dont want :
<br><br>
<div class="typo">Found a typo in the article? Click here.
</div>
</div>
In the end I would like to have:
This my wonderful sentence, however here the string i dont want :
I get the text with
//*[@id="article_body"]
Then I try to use replace:
//replace('*[@id="article_body"]','Found a typo in the article? ', )
But it doesn't work, so I think it's because I'm a newbie with XPath...
How can I do that please?
It appears that you are getting the computed string value of the selected div element.
The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.
If you don't want to include the text nodes of descendant elements, and only want the text nodes that are immediate children of the div, then adjust your XPath:
//*[@id="article_body"]/text()
Otherwise, you could use substring-before():
substring-before(//*[@id="article_body"], 'Found a typo in the article?')
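The difference between the three results can be sketched in Python. This is an emulation with the standard library's xml.etree.ElementTree, which has no text() step or XPath functions, so the string-value, the direct text nodes, and substring-before() are reproduced by hand. The sample's <br> tags are written as <br/> here so that an XML parser accepts them; that change and the variable names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

markup = """<div id="article_body" class="">
This my wonderful sentence, however here the string i dont want :
<br/><br/>
<div class="typo">Found a typo in the article? Click here.
</div>
</div>"""

div = ET.fromstring(markup)

# String-value of the element: every descendant text node, in document order.
string_value = "".join(div.itertext())

# Counterpart of /text(): only text nodes that are direct children of the div.
direct_text = "".join([div.text or ""] + [child.tail or "" for child in div])

# Counterpart of substring-before(): keep what precedes the marker.
# XPath returns "" when the marker is absent, so that case is mirrored here.
before, marker, _ = string_value.partition("Found a typo in the article?")
before_marker = before if marker else ""

print(direct_text.strip())
print(before_marker.strip())
```

Both printed lines contain only the wanted sentence, while string_value still includes the text of the nested div.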

XPath expression: selecting text nodes between element nodes

Based on the following HTML, I want to extract TextA, TextC and TextE.
<div id='content'>
TextA
<br/>
<br/>
<p>TextB</p>
TextC
<br/>
TextC
<p>TextD</p>
TextE
</div>
I tried to get TextC like so but I don't get the result I want:
Query:
//*[preceding::p[contains(.,"TextB")] and following::p[contains(.,"TextD")]]
Expected result:
["TextC", <br/>, "TextC"]
Actual result:
[<br/>]
Is there a way to select the text nodes without using indexes like //div/text()[1]?
The two text nodes aren't in the result of your XPath because * only matches elements. To match both element and text nodes, you can use node() instead:
//node()[preceding::p[contains(.,"TextB")] and following::p[contains(.,"TextD")]]
Or, if you want to get the text nodes only, i.e. excluding <br/>, you can use text() instead of node():
//text()[preceding::p[contains(.,"TextB")] and following::p[contains(.,"TextD")]]
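As a rough illustration of what the text() query selects, here is a Python sketch using only the standard library. xml.etree.ElementTree does not support text() or the preceding and following axes, so this walks the children of the div by hand. It relies on the assumption that both anchor paragraphs are direct children of the same div, as in the sample; the XPath above is more general.

```python
import xml.etree.ElementTree as ET

markup = """<div id='content'>
TextA
<br/>
<br/>
<p>TextB</p>
TextC
<br/>
TextC
<p>TextD</p>
TextE
</div>"""

div = ET.fromstring(markup)

between = []
inside = False
for child in div:
    text = "".join(child.itertext())
    if child.tag == "p" and "TextD" in text:
        break                      # closing anchor reached, stop collecting
    if child.tag == "p" and "TextB" in text:
        inside = True              # opening anchor found
    # Text that follows an element is stored in that element's .tail.
    if inside and child.tail and child.tail.strip():
        between.append(child.tail.strip())

print(between)
```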

How to find xpath of a text if it is a substring of a string in the same dom

<div>
<p>BBC Radio 1</p>
<p>BBC Radio 1Xtra</p>
</div>
I want to locate the first element (containing the text BBC Radio 1) using an XPath that contains the paragraph text. Something like: "//p[contains(text(),'BBC Radio 1')]".
However, this XPath points to both <p> nodes. Is there a way to point to the first <p> node only, using the node text, in this situation?
You can limit the result of your XPath to the first match by using index 1:
(your_initial_xpath_here)[1]
(//p[contains(text(),'BBC Radio 1')])[1]
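A small Python sketch of the same idea, using only the standard library. The substring filter is written in plain Python because xml.etree.ElementTree has no contains() function. The exact-match query at the end is an addition that is not part of the answer above: because "BBC Radio 1" is a substring of "BBC Radio 1Xtra", comparing the whole text (//p[.='BBC Radio 1'] in XPath) is another way to single out the first paragraph.

```python
import xml.etree.ElementTree as ET

markup = """<div>
<p>BBC Radio 1</p>
<p>BBC Radio 1Xtra</p>
</div>"""

root = ET.fromstring(markup)

# Substring test, like contains(text(), 'BBC Radio 1'): both paragraphs match.
matches = [p for p in root.iter("p") if "BBC Radio 1" in (p.text or "")]

# Counterpart of (...)[1]: keep only the first match in document order.
first = matches[0]

# Exact comparison of the whole text; ElementTree supports the
# [.='text'] predicate since Python 3.7.
exact = root.findall(".//p[.='BBC Radio 1']")

print(len(matches), first.text, len(exact))
```

Indexing depends on document order, whereas the exact comparison keeps working if the paragraphs are reordered.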

Xpath - matching based on node() contains() content

I have the following HTML structure (there are many blocks using the same architecture):
<span id="mySpan">
<i>
Price
<b>
3 900
<small>€</small>
</b>
</i>
</span>
Now, I want to get the content of <b> using XPath, which I tried like so:
//span[@id="mySpan"]/i/node()[1][contains(text(),"Price")]
which does not match anything. How can I match this using the node()[1] text as anchor?
Regarding the XPath you tried: node()[1] is a text node, and a text node has no text node children, so text() returns nothing there. Use . instead, which refers to the context node itself:
//span[@id="mySpan"]/i/node()[1][contains(.,"Price")]
For the ultimate goal, I'd suggest this XPath:
//span[@id="mySpan"]/i[contains(.,"Price")]/b
or, if you specifically want to match against the first node within <i> (in XPath 1.0, a node-set argument is converted to a string using its first node):
//span[@id="mySpan"]/i[contains(node(),"Price")]/b
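The last expression can be approximated in Python with the standard library. xml.etree.ElementTree has no node() test or contains() function, so the check on the first node of <i> is done on the element's text attribute, which holds the text before its first child. The wrapping <root> element and the variable names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

markup = """<root>
<span id="mySpan">
<i>
Price
<b>
3 900
<small>€</small>
</b>
</i>
</span>
</root>"""

root = ET.fromstring(markup)

prices = []
for i_elem in root.findall(".//span[@id='mySpan']/i"):
    # i_elem.text is the text node that comes first inside <i>.
    if "Price" in (i_elem.text or ""):
        b = i_elem.find("b")
        if b is not None:
            # String-value of <b>, with runs of whitespace collapsed.
            prices.append(" ".join("".join(b.itertext()).split()))

print(prices)
```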

How do I write an xpath expression for an element following another with certain text?

How do I construct an XPath expression that gets the <p> element following a <p> element that has a child <strong> element with the text "About this event:"?
In other words, what path expression will give me the <p> element with the "Hello" text that follows the <p> containing the <strong> text below?
<p>
<strong>About this event:</strong>
…
</p>
<p>Hello</p>
//p[strong[.='About this event:']]/following-sibling::p
or
//p[strong[.='About this event:']]/following-sibling::p[.='Hello']
Try this:
//p[normalize-space(strong)='About this event:']/following-sibling::p
You can also narrow this down to the first p by adding [1]:
//p[normalize-space(strong)='About this event:']/following-sibling::p[1]
I believe this should be doable. Here's the output from Perl's xpath tool:
so zacharyyoung$ xpath so.xml '//p[strong = "About this event:"]/following-sibling::p[1]'
Found 1 nodes:
-- NODE --
<p>Hello</p>
The XPath is //p[strong = "About this event:"]/following-sibling::p[1]
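For comparison, a standard-library Python sketch of the same lookup. xml.etree.ElementTree supports the child-text predicate [strong='...'] but not the following-sibling axis, so the sibling step is emulated by a small helper. The helper name first_following_sibling, the wrapping <body> element, and the omission of the elided content are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

markup = """<body>
<p>
<strong>About this event:</strong>
</p>
<p>Hello</p>
</body>"""

root = ET.fromstring(markup)


def first_following_sibling(parent, anchor, tag):
    """Return the first child of parent after anchor with the given tag, or None."""
    children = list(parent)
    for sibling in children[children.index(anchor) + 1:]:
        if sibling.tag == tag:
            return sibling
    return None


# Counterpart of //p[strong = "About this event:"] among the children of <body>.
anchor = root.find("./p[strong='About this event:']")

# Counterpart of /following-sibling::p[1]; ElementTree elements keep no
# reference to their parent, so the parent is passed in explicitly.
target = first_following_sibling(root, anchor, "p")
print(target.text)
```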
