How to get Xpath for the following? - xpath

I have an XML file like the following:
<topic>
<title>Abstract
</title>
<body>
<p>
abstract data
</p>
</body>
</topic>
<topic>
<title>Keywords</title>
<body>
<p>
keywords data
</p>
</body>
</topic>
I need to check whether the title is "Keywords" and, if so, show the text inside the <p> element.
Can anyone help me get the exact XPath for this?
Thanks in advance.

Try this one and let me know the result:
//title[text()="Keywords"]/following::p
or
//topic[title[text()="Keywords"]]//p

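//title[text()="Keywords"]/../body/p
For the text only:
//title[text()="Keywords"]/../body/p/text()
Note that body is a sibling of title, not a child, so the path has to step up to the parent topic first. Please avoid a trailing "//p" and the following axis where you can: they visit every p element below or after the matched node, not just the one in this topic.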

Try the XPath below:
//title[text()="Keywords"]/following::p
Explanation: the expression starts at the <title> element whose text is "Keywords" (matched with text()) and then moves forward to the <p> elements that come after it in the document using the following axis.
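
For anyone who wants to try this outside a browser, here is a minimal sketch using only Python's standard library. It makes two assumptions that are not in the question: the two topic elements are wrapped in a hypothetical <topics> root (an XML document needs a single root), and the variable names are made up for illustration. ElementTree implements only a subset of XPath, so the predicate is written as [title='Keywords'] rather than with text().

```python
import xml.etree.ElementTree as ET

# Assumption: <topics> is a hypothetical wrapper element, added only so the
# two <topic> elements from the question form one well-formed document.
XML = """
<topics>
<topic>
<title>Abstract
</title>
<body>
<p>
abstract data
</p>
</body>
</topic>
<topic>
<title>Keywords</title>
<body>
<p>
keywords data
</p>
</body>
</topic>
</topics>
"""

root = ET.fromstring(XML)

# Select <p> only inside the <topic> whose <title> text is exactly "Keywords".
paragraphs = root.findall(".//topic[title='Keywords']/body/p")

# The <p> text is surrounded by newlines in the source, so strip it.
keywords_text = [p.text.strip() for p in paragraphs]
print(keywords_text)
```

Because the predicate sits on topic, the p elements of the "Abstract" topic are never visited.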

Related

get Xpath data comes from External JS

Here is a simple HTML example that demonstrates my issue:
<!DOCTYPE html>
<html>
<body>
<div>
<label>This value comes from internal</label>
<div>
<div name='internal'>$11.11</div>
</div>
<div>
<label>This value comes from external</label>
<div>
<input type ='text' name='internal' readonly='true'>
</div>
</div>
</body>
</html>
//Display in Web Browser
This value comes from internal
11.11
This value comes from external
55.55
I want to get $55.55, but when I searched for "//*[text()='$55.55']" in the inspector I could not find any $55.55.
I figured out that the $55.55 value comes from an external JS file, which changes the DOM and displays it.
The value is displayed in the browser, but I could not get an XPath for the input's value.
How can I write an XPath that gets this value, "$55.55"?
Thank you.
Well, I don't know if I understand your question correctly, but if your goal is to get the input's value, you can do something as easy as this, without needing XPath:
var myInputValue = document.querySelector('input[name=internal]').value;
console.log(myInputValue);

XPATH in difficult span - Scrapy

I use Scrapy and am writing a script based on XPath selectors. I am looking for the XPath syntax to collect two values: the price and the EAN number (500.02 and 08043687312822).
<div class="emProductPrice">
<span itemprop="offers" itemscope="" itemtype="http://example.com/Offer">
<span itemprop="price" content="500.02">500,02</span> hrywna<meta itemprop="priceCurrency" content="PLN">
<meta itemprop="gtin14" content="08043687312822">
<link itemprop="itemCondition" href="http://example.com/NewCondition">
<link itemprop="availability" href="http://example.com/InStock">
</span>
</div>
I tried writing something like //div[@class="emProductPrice"]/span/span/text(), but I get only &nbsp. I need 500,02, for example.
How can I do this? Please help.
You need:
price = response.xpath('//span[@itemprop="price"]/@content').extract_first()
ean = response.xpath('//meta[@itemprop="gtin14"]/@content').extract_first()
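
To check the attribute lookups without running a spider, here is a rough standard-library sketch. It is not Scrapy code: it assumes the snippet has been rewritten as well-formed XML (the meta and link tags are self-closed) so that ElementTree can parse it, and since ElementTree cannot select attributes with /@content, the value is read with .get() on the matched element instead.

```python
import xml.etree.ElementTree as ET

# Assumption: the HTML from the question, rewritten as well-formed XML
# (void tags self-closed) so the standard-library parser accepts it.
SNIPPET = """
<div class="emProductPrice">
<span itemprop="offers" itemscope="" itemtype="http://example.com/Offer">
<span itemprop="price" content="500.02">500,02</span> hrywna
<meta itemprop="priceCurrency" content="PLN"/>
<meta itemprop="gtin14" content="08043687312822"/>
<link itemprop="itemCondition" href="http://example.com/NewCondition"/>
<link itemprop="availability" href="http://example.com/InStock"/>
</span>
</div>
"""

root = ET.fromstring(SNIPPET)

# Match elements by their itemprop attribute, as the answer's XPath does.
price_el = root.find(".//span[@itemprop='price']")
ean_el = root.find(".//meta[@itemprop='gtin14']")

price = price_el.get("content")   # machine-readable value from the attribute
displayed_price = price_el.text   # the text shown on the page
ean = ean_el.get("content")
print(price, displayed_price, ean)
```

The content attribute gives the dot-separated price, while the element text gives the comma-separated one shown on the page, so pick whichever your pipeline expects.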

xpath retrieving text inclusive of tag

I am trying to parse a webpage and get all the content inside a div tag with the class div1. I tried ('div[@class="div1"]'), which gives me the content below:
<div class="div1">
<p>
something something <br>
abc<br>
def
</p>
</div>
However, I am trying to get everything that is inside the div tag, not including the div tag as shown below
<p>
something something <br>
abc<br>
def
</p>
Try changing your xpath to
div[@class="div1"]/child::*
Quote from https://www.w3.org/TR/xpath/#location-paths:
child::* selects all element children of the context node
For one thing, you're looking for @id when it's @class
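
A small sketch of the child::* idea with Python's standard library, under the assumption that the markup is well-formed (the br tags are self-closed here, and the html/body wrapper is made up for the example). In ElementTree's XPath subset, /* plays the role of child::*.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed version of the page fragment; <br> is written
# as <br/> and the <html>/<body> wrapper is hypothetical.
HTML = """
<html><body>
<div class="div1">
<p>
something something <br/>
abc<br/>
def
</p>
</div>
</body></html>
"""

root = ET.fromstring(HTML)

# "/*" selects every element child of the matched div (like child::*).
children = root.findall(".//div[@class='div1']/*")

# Serialize the children only, so the wrapping <div> is left out.
inner_markup = "".join(ET.tostring(child, encoding="unicode") for child in children)
print(inner_markup)
```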

how to get specific text after a div with xpath

I am having trouble getting specific pieces of text that are located between two tags.
I mean, I want to get "Text after em tag. I want to get this." and also the text after the p tag, "text after this p tag. I also want to get this.".
Is there any way of doing that?
Thanks in advance.
<article>
<h1 id='h1'>Heading 1</h1>
<img src='mypath/pictures/pic.jpg'></img>
<p></p>
<div id='div1'>
<time datetime='2016'>2016</time>
</div>
<br></br>
<em>my location, TN, United States</em>
Text after em tag. I want to get this.
<p></p>
text after this p tag. I also want to get this.
<div id='div2'>
</div>
</article>
You can get the text nodes that follow an element as siblings by using
following-sibling::text()
So, to get the text right after each em:
//em/following-sibling::text()[1]
The same works for the p tag. To join the results (note that string-join requires XPath 2.0):
string-join(//em/following-sibling::text()[1] | //p/following-sibling::text()[1], ',')
I hope this helps!
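
If you are working in Python without an XPath 2.0 engine, a similar result can be sketched with the standard library. This swaps the technique: ElementTree has no following-sibling axis, but it stores the text that directly follows an element in that element's .tail attribute. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

ARTICLE = """<article>
<h1 id='h1'>Heading 1</h1>
<img src='mypath/pictures/pic.jpg'></img>
<p></p>
<div id='div1'>
<time datetime='2016'>2016</time>
</div>
<br></br>
<em>my location, TN, United States</em>
Text after em tag. I want to get this.
<p></p>
text after this p tag. I also want to get this.
<div id='div2'>
</div>
</article>"""

article = ET.fromstring(ARTICLE)

tails = []
for child in article:
    if child.tag in ("em", "p"):
        # .tail is the text between this element's end tag and the next tag.
        text = (child.tail or "").strip()
        if text:  # the first <p> is followed only by whitespace, so skip it
            tails.append(text)

joined = ",".join(tails)
print(joined)
```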

xpath find attribute by id and get the attribute parent content

I have an XML-structure that looks like this:
<document>
<body>
<section>
<title>something</title>
<subtitle>Something again</subtitle>
<section>
<p xml:id="1234">Some text</p>
</section>
</section>
<section>
<title>something2</title>
<subtitle>Something again2</subtitle>
<section>
<p xml:id="12345678">Some text2</p>
</section>
</section>
</body>
</document>
What I want to do is search for the xml:id attribute containing 12345678 and, once found, get the previous sibling (subtitle) content. Is this possible with XPath? I have this:
//p[contains(@xml:id,'12345678')]/preceding-sibling::subtitle
If I have understood the post correctly, the expected answer for this specific query is "Something again2". The subtitle is a sibling of the inner section, not of the p element, so you can use the following query instead:
UPDATED as the document schema has changed
//section[section/p[@xml:id="12345678"]]/subtitle
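
Here is a rough standard-library sketch of the same lookup. ElementTree does not support nested predicates, so the outer condition is written as a Python loop; the XML_ID constant is a name introduced here for the expanded form of xml:id, which the parser stores under the standard XML namespace.

```python
import xml.etree.ElementTree as ET

DOC = """<document>
<body>
<section>
<title>something</title>
<subtitle>Something again</subtitle>
<section>
<p xml:id="1234">Some text</p>
</section>
</section>
<section>
<title>something2</title>
<subtitle>Something again2</subtitle>
<section>
<p xml:id="12345678">Some text2</p>
</section>
</section>
</body>
</document>"""

# The xml: prefix is always bound to this namespace; ElementTree exposes
# the attribute under its expanded {namespace}name form.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

root = ET.fromstring(DOC)

# Equivalent of //section[section/p[@xml:id="12345678"]]/subtitle:
# keep each <section> that has a nested section/p with the wanted id.
subtitles = [
    outer.findtext("subtitle")
    for outer in root.iter("section")
    if outer.find(f"section/p[@{XML_ID}='12345678']") is not None
]
print(subtitles)
```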
