Get specific node position using XPath - xpath

I've been struggling to select a specific node.
My query consists of two operations. First I need to get the position of a node using this:
count((//th[text()="TEST"])[1]/preceding-sibling::*)+1
and then I do this
(//th[position() = count((//th[text()="TEST"])[1]/preceding-sibling::*)+1])
When I test the combined query in Chrome, it doesn't work, but when I execute the two parts separately they work. Any idea what is wrong?
Here is the HTML code:
<tbody>
<tr>
<th>TEST</th>
<th>TVA réduite<sup>(2)</sup></th>
<th>Prix TVA 20%</th>
<th>Surface</th>
<th>Etage</th>
<th>Orientation</th>
<th>Cave</th>
</tr>
<tr data-lot-id="0098__981104102">
<td>My Value</td>
<td>462 000€</td>
<td><strong>525 500 €</strong></td>
<td>100.23m²</td>
<td>10</td>
<td>Est</td>
<td>Non</td>
</tr>
</tbody>

I believe you are trying to get the value based on the header.
Scenario 1: Get td value based on th (header)
//td[position()=count(//th[normalize-space(.)='TEST']/preceding-sibling::th)+1]
Scenario 2: Select the header based on the position of the header text (you don't need such a complex XPath for something this simple, though ;-) )
//th[position()=count(//th[normalize-space(.)='TEST']/preceding-sibling::th)+1]
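The same header-to-column lookup can also be done in two explicit steps in a host language, which makes it easier to see what the predicate computes. Below is a minimal sketch in Python using only the standard library's xml.etree.ElementTree. The helper name column_values is hypothetical and the markup is a trimmed copy of the table from the question. ElementTree does not support count() or preceding-sibling, so the position is computed in Python instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the table from the question (assumed to be well-formed XML).
HTML = """
<tbody>
  <tr><th>TEST</th><th>Surface</th><th>Etage</th></tr>
  <tr><td>My Value</td><td>100.23m²</td><td>10</td></tr>
</tbody>
"""

def column_values(root, header_text):
    """Return the text of every td that sits under the given header."""
    headers = ["".join(th.itertext()).strip() for th in root.iterfind(".//th")]
    index = headers.index(header_text)  # 0-based; XPath position() is 1-based
    values = []
    for row in root.iterfind(".//tr"):
        cells = row.findall("td")
        if len(cells) > index:  # the header row has no td cells and is skipped
            values.append("".join(cells[index].itertext()).strip())
    return values

root = ET.fromstring(HTML.strip())
print(column_values(root, "TEST"))
print(column_values(root, "Etage"))
```

Computing the index once and reusing it for every row mirrors what the count(...)+1 expression does inside the XPath predicate.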

Related

correct way to scrape this table (using scrapy / xpath)

Given a table (with an unknown number of <tr> but always three <td> per row, where the first cell sometimes contains a strikethrough (<s>) that should be captured as an additional item with value 0 or 1):
<table id="my_id">
<tr>
<td>A1</td>
<td>A2</td>
<td>A3</td>
</tr>
<tr>
<td><s>B1</s></td>
<td>B2</td>
<td>B3</td>
</tr>
...
</table>
Where scraping should yield [[A1,A2,A3,0],[B1,B2,B3,1], ...], I am currently trying something along these lines:
my_xpath = response.xpath("//table[@id='my_id']")
for my_cell in my_xpath.xpath(".//tr"):
    print('record 0:', my_cell.xpath(".//td")[0])
    print('record 1:', my_cell.xpath(".//td")[1])
    print('record 2:', my_cell.xpath(".//td")[2])
This works in principle (e.g. by adding a pipeline after add_xpath()), but I am sure there is a more natural and elegant way to do this.
Try contains():
my_xpath = response.xpath("//table[contains(@id, 'my_id')]").getall()
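To make the target shape [[A1,A2,A3,0],[B1,B2,B3,1], ...] concrete, here is a minimal sketch of the per-row logic. It uses Python's standard xml.etree.ElementTree rather than Scrapy so that it runs on its own; the function name scrape_rows is hypothetical, and the sample markup omits the elided rows from the question.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, without the elided "..." rows.
HTML = """
<table id="my_id">
  <tr><td>A1</td><td>A2</td><td>A3</td></tr>
  <tr><td><s>B1</s></td><td>B2</td><td>B3</td></tr>
</table>
"""

def scrape_rows(table):
    """Build [c1, c2, c3, struck] per row; struck is 1 if the first cell contains <s>."""
    records = []
    for row in table.iterfind(".//tr"):
        cells = row.findall("td")
        # itertext() also picks up text nested inside <s>.
        texts = ["".join(td.itertext()).strip() for td in cells]
        struck = 1 if cells and cells[0].find("s") is not None else 0
        records.append(texts + [struck])
    return records

table = ET.fromstring(HTML.strip())
print(scrape_rows(table))
```

Inside a Scrapy callback the same idea would presumably be expressed with relative selectors on each row, for example row.xpath("./td//text()").getall() for the texts and row.xpath("./td[1]/s") for the strikethrough check.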

Cypress - click hyperlink on row based on value of two cells

Trying to get a Cypress script to click a hyperlink based on two values - the text of the hyperlink in column 1 and the value of the cell in the second column:
<tbody>
<tr>
<td>Anything</td>
<td>Casualty</td>
</tr>
<tr>
<td>Declined Prospect</td>
<td>Casualty</td>
</tr>
<tr>
<td>Declined Prospect</td>
<td>Package</td>
</tr>
<tr>
<td>Declined Prospect</td>
<td>Casualty</td>
</tr>
<tr>
<td>Irrelevant</td>
<td>Package</td>
</tr>
</tbody>
cy.get('a').contains('Declined Prospect').click()
fails because there's more than one hyperlink with that value. The id is not useful because it's dynamic.
In the example above, I want to click Declined Prospect when the second column is Casualty (but the order of the rows may vary and values in the first and second column are repeated - but only once for the combination).
Any thoughts?
The trick is to target <td>Casualty</td> then click the preceding <td><a>.
There are quite a few ways to get to sibling elements; the simplest in this case is prev().
cy.contains('td', 'Casualty') // target the 'marker' element
  .prev() // move to the previous sibling
  .click()
Approach from row and move inwards
To target a row with a specific combination of text in some of its cells, just concatenate the text and use contains().
cy.contains('tr', 'Declined Prospect Casualty') // target unique text within children
This even works when there are other cells with text that's not relevant to the search, e.g.
<tr>
<td>Declined Prospect</td>
<td>Casualty</td>
<td>Irrelevent text here</td>
</tr>
Then you can walk down the HTML tree:
cy.contains('tr', 'Declined Prospect Casualty') // target unique text within children
  .find('td a') // descendant of previous subject
  .click()
I think this can be useful: https://github.com/cypress-io/cypress-xpath
You can write selectors in XPath instead of CSS, and in XPath you can search the tree by text, e.g.:
cy.xpath("//a[text() = 'Declined Prospect']")
Edit:
You can combine several XPath conditions into one selector. It would look like this: //tr[td='Casualty']/td/a[text()='Declined Prospect']
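For comparison, the row-matching logic behind these selectors (find the row where both cells match, then take its link) can be sketched outside Cypress. The following Python sketch uses only the standard library. The markup is an assumption: the question says column 1 holds a hyperlink, so the first cell is wrapped in <a> here, and the href values and the helper name find_link are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed markup: first cell wraps an <a>; href values are placeholders.
HTML = """
<tbody>
  <tr><td><a href="#1">Anything</a></td><td>Casualty</td></tr>
  <tr><td><a href="#2">Declined Prospect</a></td><td>Package</td></tr>
  <tr><td><a href="#3">Declined Prospect</a></td><td>Casualty</td></tr>
</tbody>
"""

def find_link(root, link_text, second_column):
    """Return the single <a> whose row matches both values, or None."""
    matches = []
    for row in root.iterfind(".//tr"):
        cells = row.findall("td")
        if len(cells) < 2:
            continue
        link = cells[0].find("a")
        if link is not None and link.text == link_text and cells[1].text == second_column:
            matches.append(link)
    # The question states that each combination occurs only once.
    return matches[0] if len(matches) == 1 else None

root = ET.fromstring(HTML.strip())
link = find_link(root, "Declined Prospect", "Casualty")
print(link.get("href"))
```

Matching on both cells at row level is what keeps the selection unambiguous when either value is repeated elsewhere in the table.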

Xpath to match following sibling in another node

This is my html code:
<tr>
<th class="left_cont"><strong>Hello world</strong></th>
<td class="right_cont padding_left16px"><strong>Hi There</strong></td>
</tr>
Now, to select the text Hello world, I used:
//strong[contains(text(),'Hello world')]
That works fine for me.
Now I need to select the text Hi There relative to the Hello world text.
I need to do something like this, but I can't figure it out:
//strong[contains(text(),'Hello world')]/following-sibling::strong
This doesn't work for me.
The elements with a sibling relation are the parents of the <strong> elements, not the <strong> elements themselves, so you can try this:
//*[strong[contains(.,'Hello world')]]/following-sibling::*[strong]/strong
Or, if you are sure the parents involved are always <th> and <td>:
//th[strong[contains(.,'Hello world')]]/following-sibling::td[strong]/strong
The second <strong> element is not actually a sibling of the first one, but the wrapping <th> and <td> elements are siblings. So you could use:
//strong[contains(text(),'Hello world')]/../following-sibling::td/strong
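The "go up to the parent cell, step to the next sibling, go back down" idea in these answers can be sketched in plain Python. This is an illustrative sketch using the standard xml.etree.ElementTree, which has no following-sibling axis, so the sibling step is done by index; the helper name text_after is hypothetical.

```python
import xml.etree.ElementTree as ET

ROW = """
<tr>
  <th class="left_cont"><strong>Hello world</strong></th>
  <td class="right_cont padding_left16px"><strong>Hi There</strong></td>
</tr>
"""

def text_after(row, label):
    """Return the <strong> text of the cell that follows the cell labelled `label`."""
    cells = list(row)  # direct children of the row: the th and td siblings
    for i, cell in enumerate(cells[:-1]):
        strong = cell.find("strong")
        if strong is not None and label in (strong.text or ""):
            nxt = cells[i + 1].find("strong")  # step to the sibling cell, then down
            return nxt.text if nxt is not None else None
    return None

row = ET.fromstring(ROW.strip())
print(text_after(row, "Hello world"))
```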

Find the right node with xpath

I have a table containing tds like the one below.
I'm trying to get hold of the href part only.
Right now I have something like this:
var aTags = htmlDocument.DocumentNode.SelectNodes("//td//a[@href]");
It seems to return all the info in the td. How can I specify that I only want the href? There are many similar questions here, but I can't seem to get it to work.
<tbody>
<tr>
<td colspan="1" rowspan="1">
<a shape="rect" id="ctl00_mainCPH_ResultListUC_ResultList_ctl04_hlRubrik" href="/sitevision/proxy/4.38a41afd11d99fbdb65800016.html/svid12_38a41afd11d99fbdb65800021/-123388378/Standard/Platsannonser/VisaFritextAnnonser.aspx?ids=2499859&q=s%28sn%28systemutvecklare%29sida%281%29ar%2820%29%29" style="display:inline-block;width:160px;">Systemutvecklare</a>
</td>
</tr>
</tbody>
Every object has, for example, an OuterHtml property that looks like the a tag above;
what I need is to get the hrefs and collect them in a list of strings.
The value I want actually exists in the objects I'm getting; I just need the value of each href.
EDIT:
I seem to be able to get the inner HTML like this:
var bTags = htmlDocument.DocumentNode.SelectNodes("//td//a/@href").Select(o => o.InnerHtml).ToList();
But I still don't know how to get the hrefs.
Your XPath will get you all a elements that have an attribute named href. To get the attribute itself, you need to use //td//a/@href.
This code seems to do what I wanted:
var bTags = htmlDocument.DocumentNode.SelectNodes("//td//a/@href").Select(o => o.Attributes["href"].Value).ToList();
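The distinction between selecting the a elements and reading their href values can be shown in a few lines. This is a sketch in Python with the standard xml.etree.ElementTree, not the HtmlAgilityPack API used above; the markup is a simplified stand-in for the question's table and the href values are placeholders.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the question's table; the href values are placeholders.
HTML = """
<tbody>
  <tr><td><a id="first" href="/jobs/1.html">Systemutvecklare</a></td></tr>
  <tr><td><a id="second">No link target</a></td></tr>
  <tr><td><a id="third" href="/jobs/3.html">Tester</a></td></tr>
</tbody>
"""

root = ET.fromstring(HTML.strip())
# The path selects the <a> elements that carry an href;
# the attribute value is then read from each selected element.
hrefs = [a.get("href") for a in root.iterfind(".//td//a[@href]")]
print(hrefs)
```

The [@href] predicate only filters elements, so the anchor without an href is skipped and the value still has to be read from the attribute afterwards, which is the same two-step pattern as the final C# line.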

Finding if a content exists in a given column in a table, with Capybara or Xpath?

Problem:
Given a table, a specific piece of content should appear in the same column as a specific header.
Clarification:
I can not test the column position numerically, or at least I can't hardcode it that way, since the number of columns can change based on various other conditions and I don't want to make my test that fragile.
Example:
Name || Phone Number || Address
==============================================================
... || ... || ...
Joe || 555-787-7878 || 42 Nowhere Lane, Mulberry, California
... || ... ||
With the code looking like so:
<table>
<tr>
<th>Name</th>
<th>Phone Number</th>
<th>Registered</th>
<th>Address</th>
</tr>
<tr>
...
</tr>
<tr>
<td>Joe</td>
<td>555-377-7347</td>
<td>Yes</td>
<td>42 Nowhere Lane1, Mulberry1, California</td>
</tr>
<tr>
<td>Jerry</td>
<td>555-787-7878</td>
<td>Yes</td>
<td>50 Nowhere Lane, Mulberry, California</td>
</tr>
<tr>
<td>Tom</td>
<td>555-678-0987</td>
<td>No</td>
<td>43 Nowhere Lane2, Mulberry2, California</td>
</tr>
<tr>
...
</tr>
</table>
Scenario:
I want to ensure the correct address (42 Nowhere...) appears in the column with the header "Address".
How can I do this?
The solution might be as simple as a decent xpath query, to be honest, perhaps I don't even need anything particularly "Capybara" related!
I came across a similar case, but here I need to check whether 'Jerry' is registered or not. Please help me with how I can automate this using Ruby/Capybara.
Thanks in advance.
I think @Snekse is very close. Here are some expressions that have been tested. The following returns the table cell corresponding to the th whose value is Address:
/table/tr/td[(count(../../tr/th[.='Address']/preceding-sibling::*)+1)]
This can be used to get the value of the cell:
42 Nowhere Lane, Mulberry, California
...which you could then use to perform the final test:
/table/tr/td[(count(../../tr/th[.='Address']/preceding-sibling::*)+1)]
[.='42 Nowhere Lane, Mulberry, California']
This will return the cell itself, if the addresses match. It will return an empty node-set, which will evaluate to false in a boolean context, if no such node exists.
You probably also need to specify the row you're testing, which can be done with a slight modification to the start of the expression:
/table/tr[$n]/<rest_of_expression>
...where $n is the index of the row to select.
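The combination of "resolve the column from the header text" and "pick one specific row" can be sketched as follows. This is an illustrative Python version using the standard xml.etree.ElementTree, not Capybara. It assumes the first row holds the headers and the first cell of each row identifies it; the helper name cell_for is hypothetical.

```python
import xml.etree.ElementTree as ET

# Table from the question, without the elided "..." rows.
TABLE = """
<table>
  <tr><th>Name</th><th>Phone Number</th><th>Registered</th><th>Address</th></tr>
  <tr><td>Joe</td><td>555-377-7347</td><td>Yes</td><td>42 Nowhere Lane1, Mulberry1, California</td></tr>
  <tr><td>Jerry</td><td>555-787-7878</td><td>Yes</td><td>50 Nowhere Lane, Mulberry, California</td></tr>
  <tr><td>Tom</td><td>555-678-0987</td><td>No</td><td>43 Nowhere Lane2, Mulberry2, California</td></tr>
</table>
"""

def cell_for(table, name, header):
    """Look up the cell under `header` in the row whose first cell equals `name`."""
    rows = table.findall("tr")
    headers = [th.text for th in rows[0].findall("th")]
    column = headers.index(header)  # resolved from the header text, not hardcoded
    for row in rows[1:]:
        cells = [td.text for td in row.findall("td")]
        if cells and cells[0] == name:
            return cells[column]
    return None

table = ET.fromstring(TABLE.strip())
print(cell_for(table, "Jerry", "Registered"))
print(cell_for(table, "Joe", "Address"))
```

Because the column index comes from the header row at lookup time, the check keeps working if columns are added or reordered.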
I would think you would first have to get the position of Address, then use that information to get the value from the data table.
So something like:
var x = count(/table/tr/th[.='Address']/preceding-sibling::*)+1
var address = /table/tr/td[position()=${x}]
Or combine it into something like:
/table/tr/td[position()=(count(/table/tr/th[.='Address']/preceding-sibling::*)+1)]
NOTE: I didn't run these statements, so I have no idea if they are valid xpath syntax or not, but it should be close enough to get you there.
REFERENCE: Find position of a node using xPath
