I'm trying to get the current position of an XPath match. Here is a real-world example:
on this page, http://newyork.backpage.com/homes-for-sale/,
running the following XPath matches the 8th listing, counting from the top:
//div[contains(@class, 'cat 93893742')]
I want to get the ad position using XPath, which at the time of posting this question is "8". I tried using preceding-sibling::div, but I am getting unexpected results.
Is there any way to achieve this with XPath?
I'm not sure whether the current version of HtmlUnit supports XPath 2.0, but if it does, you can use the expression below:
index-of(//div[starts-with(@class, "cat")], //div[@class='cat 93893742'])
This returns 10, the position in the full list.
If you want the position within the list for a specific date (Thu. May. 11), you can try:
index-of(//div[normalize-space()="Thu. May. 11"]/following::div[starts-with(@class, "cat")],//div[normalize-space()="Thu. May. 11"]/following::div[@class='cat 93893742'])
which returns 8.
Some explanation added to @har07's answer.
I think this is what you need:
count(//div[contains(@class, 'cat 93893742')]/preceding-sibling::div[starts-with(@class,'cat')])+1
Let's break down the whole expression.
//div[contains(@class, 'cat 93893742')]
matches the required context node, whose class attribute contains cat 93893742.
/preceding-sibling::div[starts-with(@class,'cat')]
matches all div elements whose class starts with cat and that are preceding siblings of the context node.
Wrapping that in count() counts all of those div elements before the context node, so add 1 to include the context node itself.
If you want to select that element using the index calculated above, use:
//div[starts-with(@class,'cat')][count(//div[contains(@class, 'cat 93893742')]/preceding-sibling::div[starts-with(@class,'cat')])+1]
This is equivalent to
//div[starts-with(@class,'cat')][10]
where 10 is the computed index.
Based on this and your previous question, maybe the following XPath is what you're looking for:
count(
//div[contains(@class, 'cat 93893742')]/preceding-sibling::div[contains(@class, 'cat ')]
)+1
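To make the "count the preceding siblings, then add 1" idea concrete, here is a minimal Python sketch. It does not evaluate the XPath itself: the standard library's ElementTree has no preceding-sibling axis, so the same position is computed by walking the siblings. The markup is a hypothetical, simplified stand-in for the listing page, and ad_position is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the listing page markup.
HTML = """
<div id="listings">
  <div class="date">Thu. May. 11</div>
  <div class="cat 111">ad A</div>
  <div class="cat 222">ad B</div>
  <div class="cat 93893742">ad C</div>
  <div class="cat 333">ad D</div>
</div>
"""


def ad_position(root, class_value):
    """Return the 1-based position of the ad among its 'cat' siblings, or None."""
    for parent in root.iter():
        # Sibling divs whose class starts with 'cat', in document order.
        ads = [child for child in parent
               if child.tag == "div" and child.get("class", "").startswith("cat")]
        for index, ad in enumerate(ads, start=1):
            if ad.get("class") == class_value:
                # index equals count(preceding 'cat' siblings) + 1
                return index
    return None


root = ET.fromstring(HTML)
print(ad_position(root, "cat 93893742"))  # 3
```

The date div is skipped because its class does not start with cat, which mirrors the starts-with() filter in the expression above.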
Related
I currently have a situation where I need to get all elements on a page, iterate through them one by one, and check within each element whether a certain other element exists.
To make it clearer, this is my loop:
${elements} =    Get WebElements    //li[@class='product-item--row js_item_root ']
FOR    ${item}    IN    @{elements}
    # This is the part where I need to check within ${item} if the xpath exists:
    # //*[contains(text(), '${companyname}')]
END
So basically, I have 24 elements on my page that match the xpath
//li[@class='product-item--row js_item_root ']
I have 1 element on my page that can be located by the xpath
//li[@class='product-item--row js_item_root ']//*[contains(text(), '${companyname}')]
I want to know at which position, within the 24 elements, the element is located that contains
//*[contains(text(), '${companyname}')]
Hoping someone can help!
Edit
This does not work:
Element Should Be Visible    ${item}//*[contains(text(), 'BargainsKing')]
And that's because:
Element with locator '<selenium.webdriver.remote.webelement.WebElement (session="4b40581d8835aac628e3a2032e355ee5", element="663438f7-76eb-4801-b255-021a865035dd")>//*[contains(text(), 'BargainsKing')]' not found.
Edit
I found that I can read the element's inner HTML with
${item.get_attribute('innerHTML')}
Now my next and final question is: can I look up an XPath within this innerHTML?
One solution is to count the matching elements and then use a FOR loop to cycle through all of them:
${count} =    Get Element Count    XmlLocatorForAllElements
# XPath positions are 1-based and IN RANGE excludes its upper bound, hence ${count} + 1
FOR    ${i}    IN RANGE    1    ${count} + 1
    ${tmpElement} =    Get Element Count    XmlLocator[${i}]/WithCompany
    IF    ${tmpElement} > 0
        # element found
    END
END
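The same "loop over the rows and look inside each one" logic can be sketched in plain Python. This is only an illustration under assumptions: it parses a small, made-up HTML fragment with the standard library instead of driving a browser, and find_company_position is a hypothetical helper name. The row locator is the one from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the product list on the page.
PAGE = """
<ul>
  <li class="product-item--row js_item_root "><span>Acme Ltd</span></li>
  <li class="product-item--row js_item_root "><div><b>BargainsKing</b></div></li>
  <li class="product-item--row js_item_root "><span>Other Shop</span></li>
</ul>
"""


def find_company_position(root, company_name):
    """Return the 1-based position of the row that mentions company_name."""
    rows = root.findall(".//li[@class='product-item--row js_item_root ']")
    for position, row in enumerate(rows, start=1):
        # itertext() only walks text nested inside this row, which is the
        # "search within the current item" step of the loop.
        if any(company_name in text for text in row.itertext()):
            return position
    return None


root = ET.fromstring(PAGE)
print(find_company_position(root, "BargainsKing"))  # 2
```

The key point is that the inner search is scoped to the current row rather than the whole document. With a live Selenium element, the analogous step would be a lookup relative to the element using an XPath that starts with .// (not exercised here).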
Could you please provide more detail about the selectors?
For now, I think you should try this:
Check whether the selector contains any incrementing values.
If not, add a counter for each row.
To select a specific column, you can use either the counter or the company name in the XPath.
I am trying to write an XPath expression which can return the URL associated with the next page of a search.
The URL which leads to the next page of the search is always the href of the a tag following the span tag with class="navCurrentPage". I have been trying to use a following-sibling step to pull the next URL. My query in the Chrome console is:
$x('//span[@class="navCurrentPage"][1]/following-sibling::a/@href[1]')
I thought that specifying @href[1] would return only one URL (thinking the [1] chooses the first element in the list), but instead Chrome (and Scrapy) return four URLs. I don't understand why. Please help me understand how to select the one URL that I am looking for.
Here is the URL where you can find the HTML giving me trouble:
https://www.yachtworld.com/core/listing/cache/searchResults.jsp?cit=true&slim=quick&ybw=&sm=3&searchtype=advancedsearch&Ntk=boatsEN&Ntt=&is=false&man=&hmid=102&ftid=101&enid=0&type=%28Sail%29&fromLength=35&toLength=50&fromYear=1985&toYear=2010&fromPrice=&toPrice=&luom=126&currencyid=100&city=&rid=100&rid=101&rid=104&rid=105&rid=107&rid=108&rid=112&rid=114&rid=115&rid=116&rid=128&rid=130&rid=153&pbsint=&boatsAddedSelected=-1
Thank you for the help.
Operator precedence: //x[1] means /descendant-or-self::node()/child::x[1] which finds every descendant x that is the first child of its parent. You want (//x)[1] which finds the first node among all the descendants named x.
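A small Python sketch may help show the difference. It uses the standard library's ElementTree, which supports only a subset of XPath: the per-parent form .//a[1] works directly, while the parenthesised form (//a)[1] is emulated by collecting all matches and indexing the list. The document and its href values are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Made-up document: <a> elements spread over several parents.
DOC = """
<root>
  <p><a href="/page1"/><a href="/page2"/></p>
  <p><a href="/page3"/></p>
  <p><a href="/page4"/><a href="/page5"/></p>
</root>
"""
root = ET.fromstring(DOC)

# Like //a[1]: every <a> that is the first <a> child of its own parent.
first_in_each_parent = [a.get("href") for a in root.findall(".//a[1]")]

# Like (//a)[1]: collect all <a> descendants first, then take the first one.
first_overall = root.findall(".//a")[0].get("href")

print(first_in_each_parent)  # ['/page1', '/page3', '/page4']
print(first_overall)         # /page1
```

Three parents each contribute their own first a, which is the same effect as getting several URLs back instead of one.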
The XPath index predicate is applied within each matching parent, so several nodes can still match. If you want only the first item, take the first result:
response.xpath('//span[@class="navCurrentPage"][1]/following-sibling::a/@href[1]').extract_first()
Just add .extract_first() or .get() to fetch the first item.
See the Scrapy documentation here.
I've found this very helpful for making sure you have the brackets in the right place:
What is the XPath expression to find only the first occurrence?
Also note that XPath positions are 1-based, so the first occurrence is [1], not [0].
I have a date picker on my website.
It contains a list of elements, one for each week,
and each of those contains 7 elements, one for each day:
29 30 31 1 2 3 4
Now I'm trying to use XPath to find the last button with the class "is-selected".
I'd also like to go through each week, since I want the last possible date ("is-selected" means available).
I've tried
.//div[#'available-dates-calendar']//table/?[last()='True']//button[last()='True']
But that gave me the first element...
For the last() condition to work, I think you would have to use the spy to create an XPath query that finds all of the buttons.
Once you have such a query, you can apply the last() condition to pick the last item in the list of buttons found.
Hope this helps.
I believe you should use the [last()] predicate to grab the last element; XPath has no [-1] index. So, instead of [last()='True'], use [last()].
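As a rough illustration of "find all matching buttons first, then take the last one", here is a Python sketch using only the standard library. The calendar markup is an assumption, since the question does not show it (in particular, the id attribute on the outer div is a guess), and last_selected_button is a hypothetical helper name. In XPath terms the idea corresponds to something like (//button[contains(@class, 'is-selected')])[last()].

```python
import xml.etree.ElementTree as ET

# Assumed markup: one table row per week, one button per day.
CALENDAR = """
<div id="available-dates-calendar">
  <table>
    <tr>
      <td><button class="day">29</button></td>
      <td><button class="day is-selected">30</button></td>
      <td><button class="day is-selected">31</button></td>
    </tr>
    <tr>
      <td><button class="day is-selected">1</button></td>
      <td><button class="day is-selected">2</button></td>
      <td><button class="day">3</button></td>
    </tr>
  </table>
</div>
"""


def last_selected_button(root):
    """Return the last button in document order carrying the is-selected class."""
    selected = [button for button in root.iter("button")
                if "is-selected" in button.get("class", "").split()]
    # Index the complete list, not each week's list, to get the overall last.
    return selected[-1] if selected else None


root = ET.fromstring(CALENDAR)
print(last_selected_button(root).text)  # 2
```

Collecting across all weeks before picking the last entry is what avoids getting a "last button per week" result.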
I try to write XPath expressions so that my tests won't be broken by small design changes. So instead of the expressions that Selenium IDE generates, I write my own.
Here's an issue:
//input[@name='question'][7]
This expression doesn't work at all. Input nodes named 'question' are spread across the page. They're not siblings.
I've tried using an intermediate expression, but it also fails.
(//input[@name='question'])[2]
error = Error: Element (//input[@name='question'])[2] not found
That's why I suppose Selenium has a wrong implementation of XPath.
According to the XPath docs, the position predicate must filter by the position in the nodeset, so it must find the seventh input with the name 'question'. In Selenium this doesn't work. CSS selectors (:nth-of-kind) don't work either.
I had to write an expression that filters their common parents:
//*[contains(@class, 'question_section')][7]//input[@name='question']
Is this a Selenium-specific issue, or am I reading the specs the wrong way? What can I do to make a shorter expression?
Here's an issue:
//input[@name='question'][7]
This expression doesn't work at all.
This is a FAQ.
[] has a higher priority than //.
The above expression selects every input element with @name = 'question' that is the 7th such input child of its parent, and apparently the parents of the input elements in the document (which is not shown) don't have that many input children.
Use (note the brackets):
(//input[@name='question'])[7]
This selects the 7th input element in the document that satisfies the condition in the predicate.
Edit:
People who know Selenium (Dave Hunt) suggest that in Selenium the above expression is written as:
xpath=(//input[@name='question'])[7]
If you want the 7th input whose name attribute has the value question, in source order, then try the following:
/descendant::input[@name='question'][7]
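The "selects nothing" outcome can be reproduced with a short Python sketch. It assumes a made-up page in which every question input sits in its own section, and it uses the standard library's ElementTree, whose limited XPath support covers the per-parent form. The document-wide form is emulated by indexing the full result list.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: each question input sits in its own section,
# so no two of them are siblings.
PAGE = """
<form>
  <div class="question_section"><input name="question" value="q1"/></div>
  <div class="question_section"><input name="question" value="q2"/></div>
  <div class="question_section"><input name="question" value="q3"/></div>
</form>
"""
root = ET.fromstring(PAGE)

# Like //input[@name='question'][2]: the position is checked per parent,
# and no parent has a second input, so nothing matches.
per_parent = root.findall(".//input[@name='question'][2]")

# Like (//input[@name='question'])[2]: build the whole node list first,
# then pick the second entry.
all_inputs = root.findall(".//input[@name='question']")
second_overall = all_inputs[1]

print(len(per_parent))              # 0
print(second_overall.get("value"))  # q2
```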
I have the following XML:
<ZMARA SEGMENT="1">
<MATERIAL>000000000030001004</MATERIAL>
<PRODUCT_GROUP>14000IAA</PRODUCT_GROUP>
<PRODUCT_GROUP_DESC>HER 30 AR NEW Size</PRODUCT_GROUP_DESC>
<CLASS_CODE>I046</CLASS_CODE>
<CLASS_CODE_DESC>Heritage 30</CLASS_CODE_DESC>
<CHARACTERISTICS_01>,001,PLANNING_ALERT_PERCENTAGE, 50.000,PLANNI</CHARACTERISTICS_01>
<CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Weathered Wood,WEWD,Col</CHARACTERISTICS_02>
<CHARACTERISTICS_03>,001,ARMA_UOM,SALES SQUARE,SSQ,ARMA UNIT OF M</CHARACTERISTICS_03>
<CHARACTERISTICS_04>,001,ARMA_A_CATEGORY,05-Below 260 Lam/Multi-l</CHARACTERISTICS_04>
</ZMARA>
Using XPath I need to select the CHARACTERISTICS_XX element whose value contains the COLOR_ATTRIBUTE token. It will not always be characteristics_02. Thanks for the help. I am a total noob at XPath.
This looks like it's taken from an SAP IDoc; you can count yourself lucky that the field names are not 6-character abbreviations :)
The answer given by spinon is correct; however, if there could be another element that contains the text 'COLOR_ATTRIBUTE', this would give a more specific match:
/ZMARA/*[starts-with(local-name(.), 'CHARACTERISTICS_')][contains(.,'COLOR_ATTRIBUTE')]
Another suggestion is to avoid the '//' expression if you know where the ZMARA element can occur. In the expression above, ZMARA is only searched for as the root element, which performs better.
This should work:
//ZMARA/*[contains(.,'COLOR_ATTRIBUTE')]
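For anyone doing this from code rather than a pure XPath engine, here is a minimal Python sketch of the same selection. The standard library's ElementTree does not support contains() or starts-with(), so both conditions are checked in Python. The XML is a trimmed copy of the sample from the question, and find_characteristic is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question.
XML = """<ZMARA SEGMENT="1">
<MATERIAL>000000000030001004</MATERIAL>
<CLASS_CODE_DESC>Heritage 30</CLASS_CODE_DESC>
<CHARACTERISTICS_01>,001,PLANNING_ALERT_PERCENTAGE, 50.000,PLANNI</CHARACTERISTICS_01>
<CHARACTERISTICS_02>X,001,COLOR_ATTRIBUTE,Weathered Wood,WEWD,Col</CHARACTERISTICS_02>
<CHARACTERISTICS_03>,001,ARMA_UOM,SALES SQUARE,SSQ,ARMA UNIT OF M</CHARACTERISTICS_03>
</ZMARA>"""


def find_characteristic(root, token):
    """Return the first CHARACTERISTICS_* child whose text contains token."""
    for child in root:
        # Mirrors starts-with(local-name(.), 'CHARACTERISTICS_')
        # combined with contains(., token).
        if child.tag.startswith("CHARACTERISTICS_") and token in (child.text or ""):
            return child
    return None


root = ET.fromstring(XML)
match = find_characteristic(root, "COLOR_ATTRIBUTE")
print(match.tag)   # CHARACTERISTICS_02
print(match.text)  # X,001,COLOR_ATTRIBUTE,Weathered Wood,WEWD,Col
```

Only direct children of the root are inspected, which matches the /ZMARA/* form rather than the // form.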