How to exclude a table nested inside another table in XPath? - xpath

I have the following HTML file:
<table class="pd-table">
<caption> Tech </caption>
<tbody>
<tr data-group="1">
<td> Electrical </td>
<td> Design </td>
<tr data-group="1">
<td> Output </td>
<td> Function </td>
<tr data-group="7">
<td> EMC </td>
<table>
<tbody>
<tr>
<td> EN 6547 ESD </td>
<td> EN 8901 ESD </td>
<tr data-group="8">
<td> Weight [8] </td>
<td> 27.7 </td>
I can isolate EN 6547 ESD and EN 8901 ESD with the following XPath:
response.xpath('//table[@class="pd-table"]//tbody//tr//td/table//tr//td/text()').getall()
Any other approach is welcome :)
I would also like to get all the remaining data, excluding the values isolated above.
Is there any way to do it? :)

It looks like the table tag is not closed properly in the data-group="7" row...
Anyway, in such cases you can anchor on the text content of the cell using contains() or text()="some exact text":
response.xpath('//td[contains(text(), "EMC")]').css('td~table tbody td::text').extract()

Your XPath uses a lot of unnecessary double slashes.
A double slash (//) searches all descendants at any depth, whereas a single slash (/) only looks at direct children.
The fewer double slashes you use, the better the expression will perform.
So just use single slashes, like this:
//table[@class="pd-table"]/tbody/tr/td/table/tbody/tr/td/text()
Another way is to select the td's that have two table ancestors:
//td[count(ancestor::table)=2]/text()
And that leads to the answer to your second question:
//td[count(ancestor::table)=1]/text()
Another possibility would be:
//table[@class="pd-table"]/tbody/tr/td/text()
Or (assuming the second table does not have tr's with @data-group):
//tr[@data-group]/td/text()
So you see, many XPaths lead to Rome ;-).
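To make the ancestor-counting idea concrete, here is a minimal Python sketch. It uses only the standard library, whose ElementTree supports just a subset of XPath (no count() or ancestor axis), so the count is done by hand. The markup is a well-formed stand-in for the question's table, with closing tags added and the nested table placed inside a cell; both are assumptions, and table_depth and cell_text are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the question's markup (an assumption: closing
# tags added, nested table placed inside the second cell of the EMC row).
HTML = """
<table class="pd-table">
  <tbody>
    <tr data-group="1"><td>Electrical</td><td>Design</td></tr>
    <tr data-group="7">
      <td>EMC</td>
      <td>
        <table>
          <tbody>
            <tr><td>EN 6547 ESD</td><td>EN 8901 ESD</td></tr>
          </tbody>
        </table>
      </td>
    </tr>
    <tr data-group="8"><td>Weight [8]</td><td>27.7</td></tr>
  </tbody>
</table>
"""

root = ET.fromstring(HTML.strip())

# ElementTree elements do not know their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def table_depth(elem):
    """Count <table> ancestors, mirroring count(ancestor::table) in XPath."""
    depth = 0
    while elem in parent_of:
        elem = parent_of[elem]
        if elem.tag == "table":
            depth += 1
    return depth


def cell_text(td):
    """Return the cell's own leading text, stripped of whitespace."""
    return (td.text or "").strip()


cells = list(root.iter("td"))
# Cells of the nested table have two <table> ancestors.
nested = [cell_text(td) for td in cells if table_depth(td) == 2]
# Cells of the outer table have one; skip the wrapper cell with no own text.
outer = [cell_text(td) for td in cells if table_depth(td) == 1 and cell_text(td)]

print(nested)
print(outer)
```

With a full XPath 1.0 engine the two list comprehensions collapse into the two count(ancestor::table) expressions from the answer.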

Related

XPath: find text by the last word in the string

I need to find the whole text based on the last word in the string. I have something like this:
<table>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind2</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind3</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
</table>
I need to find the whole text value whose last word is texttofind:
<td>text text texttofind</td>
I can't use contains() because it would match multiple values. I need something like ends-with(), but I am using XPath 1.0.
I tried something like this, but it is not working and I am not sure what is wrong:
//tr[substring(., string-length(@td)
- string-length('texttofind') + 1) = 'texttofind']
Or would it be better to use matches()?
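You're almost there. The predicate has to be applied to the td rather than the tr, and string-length(.) has to measure the cell's own text. Try changing your XPath expression to
//tr//td[substring(., string-length(.)
- string-length('texttofind') + 1) = 'texttofind']
and see if it works.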
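The arithmetic behind that predicate can be checked with a small Python sketch. It does not evaluate the XPath itself; it mirrors XPath's 1-based substring() in plain Python to show why the start position string-length(.) - string-length(suffix) + 1 yields an ends-with test. The trimmed table and the helper names xpath_substring and ends_with are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the question's table (an assumption).
ROWS = """
<table>
  <tr><td>any text</td><td>text text texttofind</td></tr>
  <tr><td>any text</td><td>text text texttofind2</td></tr>
  <tr><td>any text</td><td>text text texttofind3</td></tr>
</table>
"""


def xpath_substring(value, start):
    """Mimic XPath 1.0 substring(value, start) for an integer start.

    XPath positions are 1-based; a start below 1 returns the whole string.
    """
    return value[max(start, 1) - 1:]


def ends_with(value, suffix):
    # Same arithmetic as the XPath predicate:
    # substring(., string-length(.) - string-length(suffix) + 1) = suffix
    return xpath_substring(value, len(value) - len(suffix) + 1) == suffix


root = ET.fromstring(ROWS.strip())
matches = [td.text for td in root.iter("td")
           if ends_with(td.text or "", "texttofind")]
print(matches)
```

Only the first row's cell matches, because texttofind2 and texttofind3 contain the word but do not end with it.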

Fetch parent of a specific row in a table without iteration

Consider the table structure below, which contains many rows with multiple column values. I need to identify the parent of a specific row, where the row itself is identified by one of its cells.
<table class = 'grid'>
<thead id = 'header'>
</thead>
<tbody>
<tr>
<td>
<span class="group">
<span class="group__link"><a class="disabledlink">copy</a>
</span>
</span>
</td>
<td class="COLUMNNAME">ACE</td>
<td class="COLUMNLONGNAME">Adverse Childhood Experiences</td>
<li>Family Medicine</li>
<li>General Practice</li>
</td>
<td class="COLUMNSEXFILTER">Both</td>
<td class="COLUMNAGEFILTERMIN">Any</td>
<td class="COLUMNTYPE">Score Only</td>
</tr>
<tr>
<td class="nowrap" showactionitem="2">
<span class="group">
<span class="group__link"><a onclick="Check()" href="#">copy</a>
</span>
</span>
</td>
<td class="COLUMNNAME">AM-PAC</td>
<td class="COLUMNLONGNAME">AM-PAC Generic Outpatient Basic Mobility Short Form</td>
<td class="COLUMNNOTE"></td>
<td class="COLUMNRESTRICTEDYN">No</td>
<td class="COLUMNSPECIALTYID"></td>
<td class="COLUMNSEXFILTER">Both</td>
<td class="COLUMNAGEFILTERMIN">Any</td>
<td class="COLUMNTYPE">Score Only</td>
</tr>
<tr></tr>
<tr></tr>
</tbody>
</table>
The real table contains around 100 rows. I did this using iteration and it works fine.
Is it possible to find the parent of a specific row without iteration?
You can use the parent method to find the parent of an element. Assuming that you have located a table cell, let's call it cell, you can get its row using parent and then the parent of the row with another call to parent:
cell.parent
#=> a <tr> element
cell.parent.parent
#=> the parent of the specific row - a <tbody> element in this case
Chaining multiple parent calls can become tedious and difficult to maintain. For example, you would have to call parent 4 times to get the table cell of the "copy" link. If you are after an ancestor (i.e. not the immediate parent), you are better off using XPath:
cell.table(xpath: './ancestor::table')
#=> the <table> element containing the cell
browser.link(text: 'copy').tr(xpath: './ancestor::tr')
#=> the <tr> element containing a copy link
Hopefully Issue 451 will be implemented soon, which will remove the need for XPath. You would be able to call:
cell.parent(tag_name: 'table') # equivalent to `cell.table(xpath: './ancestor::table')`
There's no need for anything fancy, Watir has an Element#parent method.
You can use this one:
parent::node()
The example below selects the parent node of the input tag with id='email':
//input[@id='email']/parent::*
The above can also be rewritten as:
//input[@id='email']/..
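As a quick check of the .. step, here is a minimal sketch using Python's standard-library ElementTree, which supports .. in its XPath subset. The form markup, its ids and class names are hypothetical, made up only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an input with id="email" wrapped in a div.
FORM = """
<form id="login">
  <div class="field">
    <input id="email" type="text"/>
  </div>
</form>
"""

root = ET.fromstring(FORM.strip())

# ".." steps from the matched <input> up to its parent element.
parent = root.find(".//input[@id='email']/..")
print(parent.tag, parent.get("class"))
```

The same path without the leading dot, //input[@id='email']/.., is what you would pass to a full XPath engine such as the one behind Selenium.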
XPath tutorial for Selenium

Import data from HTML page using feeds importer in drupal

I'm trying to import some data from a HTML page with feeds importer. The context is this:
<table class="tabela">
<tr valign="TOP">
<td class="formulario-legenda">Nome:</td>
<td nowrap="nowrap">
<b>Raul Fernando de Almeida Moreira Vidal</b>
</td>
</tr>
<tr valign="TOP">
<td class="formulario-legenda">Sigla:</td>
<td>
<b>RMV</b>
</td>
</tr>
<tr valign="TOP">
<td class="formulario-legenda">Código:</td>
<td>206415</td>
</tr>
<tr valign="TOP">
<td class="formulario-legenda">Estado:</td>
<td>Ativo</td>
</tr>
</table>
<table>
<tr>
<td class="topo">
<table>
<tr>
<td class="formulario-legenda">Categoria:</td>
<td>Professor Associado</td>
</tr>
<tr>
<td class="formulario-legenda">Carreira:</td>
<td>Pessoal Docente de Universidades</td>
</tr>
<tr>
<td class="formulario-legenda">Grupo profissional:</td>
<td>Docente</td>
</tr>
<tr valign="TOP">
<td class="formulario-legenda">Departamento:</td>
<td>
<a href="uni_geral.unidade_view?pv_unidade=151"
title="Departamento de Engenharia Informática">Departamento de Engenharia Informática</a>
</td>
</tr>
</table>
</td>
</tr>
</table>
I tried with this:
/html/body/div/div/div/div/div/div/div/table/tbody/tr/td/table/tbody/tr[1]/td[2]
but nothing appears. Can someone help me with the right syntax to obtain "Grupo Profissional"?
Quick answer that might work
Considering just the HTML sample you provided (which only has two tables) you can select the text you want using this expression, based on the table's position:
//table[2]//tr[3]/td[1]/text()
This works for the HTML you pasted above, but it might not work on your actual page: there may be other tables, the table you want has no ID, and you didn't point out any invariant text that could anchor the expression. Assuming the initial part of your XPath expression (the div sequence) is correct, you might be able to use:
/html/body/div/div/div/div/div/div/div/table[2]//tr[3]/td[1]/text()
But it's quite a fragile expression, vulnerable to any change in the document.
A (possibly) better solution
A better alternative is to look for some identifier you could use. I can only guess, since I don't know your code. In your sample, I would guess that Código and the number following it, 206415, might be an identifier. If so, you could use it to anchor your context. First you select it:
//table[.//td[text()='Código:']/following-sibling::td='206415']
The expression above selects the table that contains a td with the exact text Código: followed by a td containing the exact text 206415. This creates a unique context (provided the number is a unique identifier). From that context, you can now select the text you want, which is inside the next table (following-sibling::table[1]). This is the context of the second table:
//table[.//td[text()='Código:']/following-sibling::td='206415']/following-sibling::table[1]
And this should select the text you want (Grupo profissional:) which is in the third row tr[3] and first cell/column td[1] of that table:
//table[.//td[text()='Código:']/following-sibling::td='206415']/following-sibling::table[1]//tr[3]/td[1]/text()
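The three steps (find the anchor table, move to the next sibling table, pick row 3, cell 1) can be sketched in Python. The standard-library ElementTree has no following-sibling axis, so the sibling hop is done with a plain list of children. The markup is a shortened stand-in for the question's page wrapped in a single div, and is_anchor is a hypothetical helper name; both are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the question's page (an assumption).
PAGE = """
<div>
  <table class="tabela">
    <tr><td class="formulario-legenda">Código:</td><td>206415</td></tr>
    <tr><td class="formulario-legenda">Estado:</td><td>Ativo</td></tr>
  </table>
  <table>
    <tr>
      <td class="topo">
        <table>
          <tr><td class="formulario-legenda">Categoria:</td><td>Professor Associado</td></tr>
          <tr><td class="formulario-legenda">Carreira:</td><td>Pessoal Docente de Universidades</td></tr>
          <tr><td class="formulario-legenda">Grupo profissional:</td><td>Docente</td></tr>
        </table>
      </td>
    </tr>
  </table>
</div>
"""

root = ET.fromstring(PAGE.strip())


def is_anchor(table):
    """True if some row pairs the label 'Código:' with the value '206415'."""
    for tr in table.iter("tr"):
        cells = [(td.text or "").strip() for td in tr.findall("td")]
        if cells[:2] == ["Código:", "206415"]:
            return True
    return False


# Step 1: locate the anchor table among the container's children.
siblings = list(root)
anchor_index = next(i for i, el in enumerate(siblings)
                    if el.tag == "table" and is_anchor(el))

# Step 2: the equivalent of following-sibling::table[1].
next_table = next(el for el in siblings[anchor_index + 1:] if el.tag == "table")

# Step 3: third row, first cell, as in //tr[3]/td[1].
label = next_table.find(".//tr[3]/td[1]").text
print(label)
```

Anchoring on the identifier means the lookup keeps working if more tables are added before the anchor, which is the weakness of the purely positional //table[2] version.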

Xpath or CSS selector get specific node

I need to know whether 1324 was a win or a loss in a table. How do I select the single <td> element that tells me if it was a loss or a win?
<tr>
<td> 1323 </td>
<td> Won </td>
</tr>
<tr>
<td> 1324 </td>
<td> Loss </td>
</tr>
[...]
<tr>
<td> 1328 </td>
<td> Won </td>
</tr>
Whilst the other answers to this question are correct, people are forgetting the context: Selenium. Give it those XPaths and it will blow up in your face.
Selenium expects XPath queries to return physical DOM elements, not attributes or text nodes of those elements.
You should find the element and use Selenium to get its text. This could be .getText(), .Text, or something similar in whatever language you are using (C# and Java examples below, assuming driver is a valid driver instance):
C#:
driver.FindElement(By.XPath("//td[text()='1324']/following-sibling::td")).Text;
Java:
driver.findElement(By.xpath("//td[text()='1324']/following-sibling::td")).getText();
Try this:
//td[text()='1324']/../td[2]/text()
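The "locate the element first, then read its text" pattern can be tried outside Selenium with a minimal Python sketch based on the standard-library ElementTree. The table is wrapped in a table element to make it well-formed, the cells keep the surrounding spaces shown in the question (so the comparison strips them), and result_for and game_id are hypothetical names; all of these are assumptions.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the question's rows (wrapper element is an assumption).
TABLE = """
<table>
  <tr><td> 1323 </td><td> Won </td></tr>
  <tr><td> 1324 </td><td> Loss </td></tr>
  <tr><td> 1328 </td><td> Won </td></tr>
</table>
"""

root = ET.fromstring(TABLE.strip())


def result_for(game_id):
    """Return the text of the cell next to the one holding game_id, or None."""
    for tr in root.iter("tr"):
        first, second = tr.findall("td")[:2]
        # Strip the padding so " 1324 " compares equal to "1324".
        if first.text.strip() == game_id:
            return second.text.strip()
    return None


print(result_for("1324"))
```

If the real cells are padded like this, an exact text()='1324' comparison in XPath would not match; normalize-space(.)='1324' is the XPath 1.0 way to ignore the padding.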

selenium webdriver xpath generation

I want to create an XPath for clicking "Run" (4th column) based on the first column value (xyz). The XPath below doesn't work. Can you suggest a better way of writing it?
//table/tbody/tr/td[text()='xyz fix']/parent::tr/td[4]
<div id="main">
<table class="FixedLayout" width="1000px">
<tbody>
<tr></tr>
<tr>
<td class="RowHeight">
xyz
</td>
<td>xyz fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
<tr>
<td class="RowHeight">
abc
</td>
<td>abc fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
</tbody>
</table>
</div>
I don't see why yours didn't work. Please clarify what "doesn't work" means. NoSuchElementException? ElementNotVisibleException? Wrong XPath? Not clicking the link, or what?
Meanwhile, try the following XPaths (but the issue could be your Selenium code rather than the XPath):
Here I assume you want the <a> link instead of the <td>, because you mentioned you want to click it.
Use an XPath predicate:
//*[@id='main']//table/tbody/tr[td[text()='xyz']]/td[4]/a
Use an XPath predicate with an attribute selector to avoid using an index:
//*[@id='main']//table/tbody/tr[td[text()='xyz']]//a[contains(@href, 'Instance/Create')]
Use .. to get the parent:
//*[@id='main']//table/tbody/tr/td[text()='xyz']/../td[4]/a
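The row-predicate form, tr[td...], can be tried with Python's standard-library ElementTree, which supports a [td='text'] predicate. This is a minimal sketch, not Selenium code: the markup is a compacted copy of the question's table (cell whitespace removed, no <a> element because the question's sample shows none), and it matches on the second column's exact text 'xyz fix'. Those choices are assumptions.

```python
import xml.etree.ElementTree as ET

# Compacted copy of the question's markup (whitespace in cells removed).
MAIN = """
<div id="main">
  <table class="FixedLayout">
    <tbody>
      <tr></tr>
      <tr><td class="RowHeight">xyz</td><td>xyz fix</td><td>1125</td><td>Run</td></tr>
      <tr><td class="RowHeight">abc</td><td>abc fix</td><td>1125</td><td>Run</td></tr>
    </tbody>
  </table>
</div>
"""

root = ET.fromstring(MAIN.strip())

# The predicate keeps only the row that has a cell with the exact text 'xyz fix'.
row = root.find(".//table/tbody/tr[td='xyz fix']")

# From that row, positional steps pick individual columns.
first_cell = row.find("td[1]").text
run_cell = row.find("td[4]").text
print(first_cell, run_cell)
```

Putting the condition on the row and then stepping to td[4] avoids the detour through the matched cell and back up to its parent.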
