XPath index won't work

I need the third image with that class and parent. None of these xpaths seem to be valid.
xpath=(//div[@class='itemTileV5'])//img[@class='dealItem']/@src[3]
xpath=(//div[@class='itemTileV5']//img[@class='dealItem'])/@src[3]
xpath=(//div[@class='itemTileV5']//img[@class='dealItem']/@src)[3]
Notice I move the parentheses around and it's always an invalid path. Without parentheses it won't work either.
Please help.
<div class="itemTileV5">
<div class="top">
<a href="/Grocery_deals/p_pepperidge-farm-goldfish-variety-pack-bold-mix-29-4-ounce">
<img class="Item" src="https://img.google.com/ai/184x184/dealimage/1493649114.jpg" alt="Pepperidge Farm">
</a>
</div>
</div>

All three of your expressions are valid in all versions of XPath. If you're getting an error, please tell us what it is, and what XPath processor generated it.
The first two expressions aren't useful, because @src[3] selects the third attribute called "src", and an element can have only one attribute with a given name.
Your informal requirement "the third image with that class and parent" seems to translate to (//div[@class='itemTileV5']/img[@class='dealItem'])[3]/@src
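To make the difference concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The three-tile page is made up for illustration, and because ElementTree only implements a subset of XPath (no parenthesized expressions, no attribute steps), indexing the result list stands in for (...)[3] and .get("src") stands in for /@src. It uses the descendant step // from the question, since the image in the sample markup sits inside a link rather than directly under the div.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: three product tiles, each wrapping one image in a link.
PAGE = """
<body>
  <div class="itemTileV5"><a href="/a"><img class="dealItem" src="a.jpg"/></a></div>
  <div class="itemTileV5"><a href="/b"><img class="dealItem" src="b.jpg"/></a></div>
  <div class="itemTileV5"><a href="/c"><img class="dealItem" src="c.jpg"/></a></div>
</body>
"""
root = ET.fromstring(PAGE)

# All matching images in document order; the [3] in (...)[3] indexes this list.
images = root.findall(".//div[@class='itemTileV5']//img[@class='dealItem']")
third_src = images[2].get("src")  # XPath positions are 1-based, Python's are 0-based

# A positional predicate attached directly to img counts per parent instead,
# so it finds nothing here: no <a> holds a third img child.
per_parent_third = root.findall(".//div[@class='itemTileV5']//img[3]")

print(third_src, per_parent_third)
```

With a full XPath 1.0 engine the parenthesized form should give the same third image directly.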

Related

How to access second element using relative Xpath

Given this page snippet
<section id="mysection">
<div>
<div>
<div>
<a href="">
<div>first</div>
</a>
</div>
<div>
<a href="">
<div>second</div>
</a>
</div>
</div>
</div>
</section>
I want to access the second a element using a relative XPath. In Firefox (locating with Selenium IDE), this
//section[@id='mysection']//a[1]
works, but this does not match:
//section[@id='mysection']//a[2]
What is wrong with the second expression?
EDIT: Actually, I do not care so much about Selenium IDE (I just use it for quick verification). I want to get it working with selenium2library in Robot Framework. Here, the output is:
ValueError: Element locator with prefix '(//section[@id' is not supported
for the suggested solution (//section[@id='mysection']//a)[2]
You can use this. It selects the anchor descendants of the section and returns the second node. It works with an XSLT processor; I hope it works with Selenium too.
//section[@id='mysection']/descendant::a[2]
Try this instead:
(//section[@id='mysection']//a)[2]
//a[2] looks for an <a> element that is the second <a> child of its own parent. Since each parent <div> contains only one <a> child, your XPath didn't match anything.
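The per-parent counting can be checked with a small sketch in Python's standard xml.etree.ElementTree, which follows the same rule for positional predicates. This is only an illustration: the snippet is a compacted copy of the one in the question, the section element is the root so .//a plays the role of //section[@id='mysection']//a, and since ElementTree has no parenthesized expressions, indexing the result list stands in for (...)[2].

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<section id="mysection">
  <div><div>
    <div><a href=""><div>first</div></a></div>
    <div><a href=""><div>second</div></a></div>
  </div></div>
</section>
"""
section = ET.fromstring(SNIPPET)


def label(a):
    """Return the text of the div nested inside an <a> element."""
    return a.find("div").text


# a[1]: every <a> that is the first <a> child of its own parent (both links).
first_of_each_parent = [label(a) for a in section.findall(".//a[1]")]

# a[2]: every <a> that is the second <a> child of its parent (none exist).
second_of_each_parent = [label(a) for a in section.findall(".//a[2]")]

# Equivalent of (//section//a)[2]: collect all links first, then take the second.
second_in_document = label(section.findall(".//a")[1])

print(first_of_each_parent, second_of_each_parent, second_in_document)
```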
With this:
//section[@id='mysection']//a[1]
you are matching every 'a' element that is the first 'a' child of its parent (inside one div, for example), but with this
//section[@id='mysection']//a[2]
you are trying to match every 'a' element that is the second 'a' child of its parent, and none of the nodes contains more than one 'a' element.
The node to index is therefore the parent div of those 'a' tags, since the divs are the siblings that increment.
Very simple:
//section[@id='mysection']//a[1] - both elements
This is why the previous answer with parentheses around the whole expression is correct.
//section[@id='mysection']//div[1]/a - only the first element
//section[@id='mysection']//div[2]/a - only the second element
Another way to match each 'a' separately:
//section[@id='mysection']//a[div[text()='first']]
//section[@id='mysection']//a[div[text()='second']]
Another way to reach the second a element is to start from <div>second</div> (call this the bottom-up approach) instead of starting from the section element <section id="mysection"> (call this the top-down approach).
Using the div child of the a element, the solution looks like this:
//div[.='second']/..
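As a rough check of the bottom-up idea, the sketch below runs the same expression through Python's standard xml.etree.ElementTree (the [.='text'] predicate needs Python 3.7 or later). The snippet keeps the line breaks of the original on purpose: the comparison is against the whole string value of the div, so the wrapper elements, whose values include the surrounding whitespace, do not match.

```python
import xml.etree.ElementTree as ET

SNIPPET = """<section id="mysection">
<div>
<div>
<div>
<a href="">
<div>first</div>
</a>
</div>
<div>
<a href="">
<div>second</div>
</a>
</div>
</div>
</div>
</section>"""
section = ET.fromstring(SNIPPET)

# Find the div whose complete text is exactly "second", then step up to its parent.
matches = section.findall(".//div[.='second']/..")
parent_tags = [element.tag for element in matches]

print(parent_tags)
```

The same exact-match behaviour is what makes this approach fragile: extra whitespace or markup inside the div would stop it from matching.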

XPath expression -hierarchy

<div class="summary-item">
<label >Price</label>
<div class="value">
0.99 GBP
</div>
</div>
<div class="summary-item">
<label >Other info</label>
<div class="value">
All languages
</div>
</div>
I am trying to get the "0.99 GBP" using an XPath expression. So far I have reached the label using this (note that there is another element with the class summary-item, so I need to identify the right one by the label text Price):
sel.xpath('//*/div[@class="summary-item"]/label[text()="Price"]').extract()
However, I am unable to get to the div with class value. I tried using following-sibling but did not succeed. Any help will be appreciated.
The existence of child nodes can be part of the predicate. Put the test for label into a predicate for the parent, either as a separate predicate (adding the target node as well):
//div[@class="summary-item"][label[text()="Price"]]/div[@class="value"]
or joined with and:
//div[@class="summary-item" and label[text()="Price"]]/div[@class="value"]
(Note you don’t need //*/div at the start.)
You could use following-sibling if you wanted; it would look like this:
//div[@class="summary-item"]/label[text()="Price"]/following-sibling::div[@class="value"]
(here the label test is a step in the path rather than part of a predicate on the parent div).
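Here is a minimal sketch of the predicate-on-the-parent idea using only Python's standard xml.etree.ElementTree. It makes some assumptions: the two divs are wrapped in a made-up root element so the fragment parses, and because ElementTree supports neither text() nor following-sibling, the predicate [label='Price'] stands in for [label[text()="Price"]]. In Scrapy, which is backed by a full XPath 1.0 engine, the expressions above should work as written inside sel.xpath(...).

```python
import xml.etree.ElementTree as ET

# The markup from the question, wrapped in a hypothetical <root> element.
HTML = """
<root>
<div class="summary-item">
<label>Price</label>
<div class="value">
0.99 GBP
</div>
</div>
<div class="summary-item">
<label>Other info</label>
<div class="value">
All languages
</div>
</div>
</root>
"""
root = ET.fromstring(HTML)

# Two chained predicates on the parent: the class test, then the child-label test.
value_div = root.find(
    ".//div[@class='summary-item'][label='Price']/div[@class='value']"
)
price = value_div.text.strip()  # drop the newlines around the text

print(price)
```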
One more thing to be aware of: using XPath to select HTML classes doesn’t work the same as using CSS. XPath will only match the exact string of the class attribute, whereas CSS matches even if the element is in more than one class. In this case it works out okay, but you should watch out for it. Search StackOverflow if it will be an issue; there are a few answers describing it.
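To illustrate the class caveat, here is a small sketch with Python's standard xml.etree.ElementTree on made-up markup. The has_class helper is a hypothetical name, not part of any library; it splits the attribute on whitespace the way CSS class matching does. In XPath 1.0 itself, a commonly used idiom for the same test is contains(concat(' ', normalize-space(@class), ' '), ' value ').

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one exact class, one multi-class element, one near miss.
root = ET.fromstring("""
<root>
<div class="value">exact</div>
<div class="value highlighted">two classes</div>
<div class="values">different class</div>
</root>
""")

# Attribute equality compares the whole string, so the multi-class div is missed.
exact_only = [d.text for d in root.findall(".//div[@class='value']")]


def has_class(element, name):
    """Return True if name is one of the whitespace-separated class tokens."""
    return name in element.get("class", "").split()


# Token-based matching, which is how a CSS selector like div.value behaves.
token_match = [d.text for d in root.iter("div") if has_class(d, "value")]

print(exact_only, token_match)
```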

xpath accessing information in nodes

I need to scrape information from a website containing property details.
<div class="inner">
<div class="col">
<h2>House in Digana </h2>
<div class="meta">
<div class="date"></div>
<span class="category">Houses</span>,
<span class="location">Kandy</span>
</div>
</div>
<div class="attr polar">
<span class="data">Rs. 3,600,000</span>
</div>
What is the XPath notation for "Kandy" and "Rs. 3,600,000"?
It is not wise to address text nodes directly using text() because of nuances in an XML document.
Rather, addressing an element node directly returns the concatenation of all descendant text nodes as the element value, which is what people usually want (and think they are getting when they address text nodes).
The canonical example I use in the classroom is this example of OCR'ed content as XML:
<cost>39<!--that 9 may be an 8-->.22</cost>
The value of the element using the XPath address cost is "39.22", but in XSLT 1.0 the value of the XPath address cost/text() is "39", which is incomplete. In XSLT 2.0 (which is how the question is tagged), you get two text nodes, "39" and ".22", which look correct if you concatenate them. But if you pass them to a function requiring a singleton argument, you will get a run-time error. When you address an element, the text returned is concatenated into a single string, which is suitable for a singleton argument.
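The split caused by the comment can be seen without an XSLT processor. The sketch below is only an analogy, using Python's standard xml.dom.minidom rather than XPath: it lists the text-node children of cost and then joins them, which mirrors the difference between taking the first text() node and taking the element's value.

```python
from xml.dom import minidom

doc = minidom.parseString("<cost>39<!--that 9 may be an 8-->.22</cost>")
cost = doc.documentElement

# The comment separates the character data into two distinct text nodes.
text_nodes = [n.data for n in cost.childNodes if n.nodeType == n.TEXT_NODE]

# Roughly what you see when only the first text node is used.
first_text_node = text_nodes[0]

# Roughly what the element's string value is: all text nodes concatenated.
element_value = "".join(text_nodes)

print(text_nodes, first_text_node, element_value)
```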
I tell students that in all of my professional work there are only very (very!) few times that I ever have to use text() in my stylesheets.
So //span[@class='location' or @class='data'] would find the two fields if those were the only such elements in the entire document. You may need to use ".//span" from a location inside of the document tree.
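Applied to the listing from the question, addressing the elements rather than their text nodes could look like the sketch below. It uses Python's standard xml.etree.ElementTree, which has no "or" operator in predicates, so the two classes are queried separately. The final closing </div> is an assumption added so the fragment parses, and element_value is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Markup from the question, with one closing </div> added (assumed) at the end.
LISTING = """
<div class="inner">
<div class="col">
<h2>House in Digana </h2>
<div class="meta">
<div class="date"></div>
<span class="category">Houses</span>,
<span class="location">Kandy</span>
</div>
</div>
<div class="attr polar">
<span class="data">Rs. 3,600,000</span>
</div>
</div>
"""
root = ET.fromstring(LISTING)


def element_value(element):
    """Concatenate all descendant text, like the string value of an element."""
    return "".join(element.itertext()).strip()


location = element_value(root.find(".//span[@class='location']"))
price = element_value(root.find(".//span[@class='data']"))

print(location, price)
```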

Google Spreadsheet importxml timestamp

I have been trying for over 2 hours to import the timestamps from a zap2it.com link into my Google spreadsheet.
Here is the link I am trying to importxml from:
http://affiliate.zap2it.com/tvlistings/ZCGrid.do?zipcode=78238&lineupId=DISH641:-
Here is what I am trying to import
Here is what I tried so far
=importxml("http://affiliate.zap2it.com/tvlistings/ZCGrid.do?aid=dish&pkg=8388608&fromProvider=true&zipcode=78238&x=52&y=18"&B1,"//body//div[3]/div/div/div[3]/div/div")
EDIT
I was able to improve and get better results
//body//div[3]/div/div/div[1]//*
but it shows timestamps from all over the page, not exactly what I need.
[The first complication is that the data stream returned from dereferencing that URI is not actually XML; it has several thousand well-formedness errors (unescaped ampersands in URIs, unescaped ampersands and less-than signs in scripts, some embedded HTML, some miscellaneous errors). Since you're not reporting problems from that, however, I'll assume that somewhere between the server and your XPath expression someone is doing some tidying.]
I think you'll get better results if you use the id and class attributes that are extensively used in the document. The material you want looks like this in the source (you can use any browser-based debugging tool to find it; I used the 'Web Inspector' in Safari); I have indented to make the structure more visible, and fixed some well-formedness errors in one of the a elements (missing whitespace between attribute-value pairs).
<div class="zc-tn" id="zc-tn-top">
<div class="zc-tn-i">
<a href="ZCGrid.do?fromTimeInMillis=1355781600000"
class="zc-tn-l"
title="Move the grid three hours earlier"></a>
<div class="zc-tn-c">
<span class="zc-tn-z"
title="Central Standard Time">CST</span>
<div class="zc-tn-t">7:00 PM</div>
<div class="zc-tn-t">7:30 PM</div>
<div class="zc-tn-t">8:00 PM</div>
<div class="zc-tn-t">8:30 PM</div>
<div class="zc-tn-t">9:00 PM</div>
<div class="zc-tn-t">9:30 PM</div>
</div>
<a href="ZCGrid.do?fromTimeInMillis=1355803200000"
class="zc-tn-r"
title="Advance the grid three hours"></a>
</div>
</div>
A simple search verifies that the value zc-tn-top is indeed unique as an ID value in the document. Given that, a simple XPath expression to retrieve all the elements whose display is circled in your image is (assuming xhtml is bound to the XHTML namespace):
//xhtml:div[@id='zc-tn-top']//xhtml:div[@class='zc-tn-t']
It looks from your question as if your XPath evaluator is namespace-challenged or namespace-oblivious, so you may need to write this as
//div[@id='zc-tn-top']//div[@class='zc-tn-t']
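To see why anchoring on the id narrows the result, here is a minimal sketch with Python's standard xml.etree.ElementTree on a trimmed copy of the markup. The second container with id zc-tn-bottom is hypothetical; it was added only to stand in for timestamps elsewhere on the page.

```python
import xml.etree.ElementTree as ET

# Trimmed markup; the zc-tn-bottom block is an invented decoy for illustration.
GRID = """
<body>
<div class="zc-tn" id="zc-tn-top">
<div class="zc-tn-i">
<div class="zc-tn-c">
<span class="zc-tn-z" title="Central Standard Time">CST</span>
<div class="zc-tn-t">7:00 PM</div>
<div class="zc-tn-t">7:30 PM</div>
<div class="zc-tn-t">8:00 PM</div>
</div>
</div>
</div>
<div class="zc-tn" id="zc-tn-bottom">
<div class="zc-tn-t">7:00 PM</div>
</div>
</body>
"""
body = ET.fromstring(GRID)

# Positional paths or a bare class test pick up timestamps from the whole page.
everywhere = [d.text for d in body.findall(".//div[@class='zc-tn-t']")]

# Anchoring on the unique id restricts the match to the top time bar.
top_bar = [d.text for d in body.findall(".//div[@id='zc-tn-top']//div[@class='zc-tn-t']")]

print(len(everywhere), top_bar)
```

In the spreadsheet, the same expression would go in the second argument of importxml, assuming the function manages to parse the page at all.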

discover a certain part of a page with selenium

I have a webpage that looks something like this:
<html>
...
<div id="menu">
...
<ul id="listOfItems">
<!--- repeated block start -->
<li id="item" class="itemClass">
...
<span class="spanClass"><span class="title">title</span></span>
...
</li>
<!-- repeated block end-->
<li id="item" class="itemClass">
...
<span class="spanClass"><span class="title">title something</span></span>
...
</li>
<li id="item" class="itemClass">
...
<span class="spanClass"><span class="title">title other thing</span></span>
...
</li>
</ul>
...
</div>
...
</html>
I would like to know the XPath of the titles ("title", "title something", "title other thing"). The point is that the order of the <li> elements is not specified; it could be different after every page load. Is there a way to discover a certain structure of the page with XPath? I have an idea of how to solve this, but before I write iterations in C# to walk the page, I am asking you.
Thanks in advance!
First of all, ids should be unique, so the webpage as you've portrayed it would not work well when it comes to testing.
I did however test, and got some XPath locators to work for selecting specific titles (although I recommend you fix your webpage instead of actually using this):
//li[@id='item']/span/span
//li[@id='item'][1]/span/span
//li[@id='item'][3]/span/span
If you're after all three titles, you could try Dimitre Novatchev's suggestion:
//span[@class='title']
This should get all titles on the page.
One more thing, however: if you're getting into Selenium, I recommend you download the Selenium IDE extension for Firefox. It's a great tool for beginners. It helps you build your Selenium tests by recording your clicks on a website, and it also helps you auto-generate and test your XPath locators and other locators.
And again: I urge you not to make a website with duplicate id values :-)
Does Selenium support XPath expressions like:
//span[@class='title']
If yes, then use the above XPath expression. It selects every span element in the XML document whose class attribute has the string value "title".
I recommend using a tool like the XPath Visualizer to play with different XPath expressions and see the selected nodes highlighted in the source XML document.
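The //span[@class='title'] expression does not depend on the order of the li elements, which can be checked outside Selenium. The sketch below uses Python's standard xml.etree.ElementTree on a trimmed copy of the page (the "..." placeholders are left out), and the shuffled copy is a made-up stand-in for a page that loaded in a different order.

```python
import xml.etree.ElementTree as ET

# Trimmed markup from the question; the elided "..." parts are omitted.
PAGE = """
<div id="menu">
<ul id="listOfItems">
<li id="item" class="itemClass"><span class="spanClass"><span class="title">title</span></span></li>
<li id="item" class="itemClass"><span class="spanClass"><span class="title">title something</span></span></li>
<li id="item" class="itemClass"><span class="spanClass"><span class="title">title other thing</span></span></li>
</ul>
</div>
"""


def titles(root):
    """Collect the text of every span whose class attribute equals 'title'."""
    return [span.text for span in root.findall(".//span[@class='title']")]


original = ET.fromstring(PAGE)

# Simulate a different load order by reversing the <li> children of the list.
shuffled = ET.fromstring(PAGE)
items = shuffled.find("ul")
items[:] = list(reversed(list(items)))

print(titles(original))
print(titles(shuffled))
```

The lists come back in a different order, but they hold the same three titles, so comparing them as sets is enough when the order is unspecified.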

Resources