This is my HTML:
<tr>
<td bgcolor="ffffff" height="14" width="112"><p class="boldblack"> Price:</p></td>
<td bgcolor="ffffff" width="296"><p class="cena2">9 000 $</p></td>
<td bgcolor="ffffff"></td>
</tr>
I want to take the 9 000
What I have tried
.//p[contains(., 'Price:')]
which gives me the Price: node. Now, how can I reach the 9000 from the Price node?
Note
I can't use an XPath like td[2] because the content is dynamic. I only know that there will be a Price: node and that its parent's sibling will contain the 9 000 $.
Update 1
I can't rely on class either because the structure of the HTML is very bad.
One option would be simply to rely on the class name (cena from Russian is price):
//p[@class="cena2"]/text()
If you want to rely on the preceding Price: label:
//tr[td[1]/p[contains(., "Price:")]]/td[2]/p/text()
Another option would be to check whether the text ends with a $ sign (note that ends-with() requires XPath 2.0):
//tr/td/p[ends-with(., "$")]/text()
As you can see, there are multiple options. It is hard to tell which one is more reliable since you haven't shown the complete HTML code. You can even combine all 3 options I've presented.
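To illustrate the "find the label cell, then take its neighbouring cell" idea from the question, here is a minimal Python sketch. It is an assumption on my part that Python is acceptable (the question names no language), and it uses only the standard library's xml.etree.ElementTree, which supports just a small XPath subset, so the sibling step is done by list index instead. The helper name value_after_label is made up for this example. In plain XPath 1.0 the same idea could be written as //td[p[contains(., 'Price:')]]/following-sibling::td[1]/p/text().

```python
import xml.etree.ElementTree as ET

# The row from the question, unchanged.
ROW = """<tr>
<td bgcolor="ffffff" height="14" width="112"><p class="boldblack"> Price:</p></td>
<td bgcolor="ffffff" width="296"><p class="cena2">9 000 $</p></td>
<td bgcolor="ffffff"></td>
</tr>"""


def value_after_label(row, label):
    """Return the text of the cell right after the cell containing `label`."""
    cells = list(row)  # direct child elements of the row, in document order
    for i, cell in enumerate(cells[:-1]):
        if label in "".join(cell.itertext()):
            # The next sibling cell holds the value; no class or fixed
            # column number is needed.
            return "".join(cells[i + 1].itertext()).strip()
    return None  # label not found, or it sits in the last cell


row = ET.fromstring(ROW)
price = value_after_label(row, "Price:")
print(price)
```

Because the lookup starts from the label, it keeps working if extra cells are added before the Price: cell.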
Related
I am trying to figure out a way to pull specific values out of a big long text block.
So far I have //td[@class="PadLeft10"] which returns me a big long value starting with the company name and ending with the "View More Info" piece.
I am trying to break my results up into segments, so for example I want my code to look for the words "Primary Contact:" and then return the text that follows that, ending at the <br/>.
I need to get the Company Name, which is always the first bit of text, then the Primary Contact, then the Address, then the Phone and Fax, then the Website, and the Organization type.
The problem is that not every record has all the values. As you can see, the second entry has the address and website, but the first one doesn't.
I am using the Dataminer Chrome Plugin, for anyone familiar with that. It has separate xpath for rows and columns, so I am going to try to make a bunch of different columns that correspond to each of the fields that I am looking for.
Any direction would be greatly appreciated.
<td align="left" valign="top" width="2%">
<script>
if (0 == 1) document.write('<img src="https://website.com" border="0" alt=""/>');
</script>
<br/><br/></td>
<td class="PadLeft10" align="left" valign="top" width="32%" style="padding-left: 15px;">
<span style="font-weight: bold;font-size: 12pt;"><br/>Company Name Here</span><br/>Primary Contact: Mr. Eric Cartman <br/>Phone: (555) 555-5555<br/>Fax: (333) 333-3333<span style="text-decoration: underline;color: #0000ff"></span><br/>Organization Type: Distributor Branch
<br/>
» View More Info<br/>
<br/>
</td>
<td align="left" valign="top" width="2%">
<script>
if (0 == 1) document.write('<img src="https://website.com" border="0" alt=""/>');
</script>
<br/><br/></td>
<td class="PadLeft10" align="left" valign="top" width="32%" style="padding-left: 15px;">
<span style="font-weight: bold;font-size: 12pt;"><br/>Other Company</span><br/>Primary Contact: Mr. Jimmy Valmer<br/>100 N Ohio St 2rd Fl<br/>Rochester, IN 54225<br/>United States<br/>Phone: (888) 888-8888<br/>Fax: (999) 999-9999<span style="text-decoration: underline;color: #0000ff"><br/>Web Site: http://www.companywebsite.com</span><br/>Organization Type: Financial Service
<br/>
» View More Info<br/>
<br/>
</td>
</tr>
<tr>
I am new to XPath, but the least I can say is this: if you are the author of the HTML code, you really need to change it to be more structured,
like: Primary Contact:<span id/class='primaryContact'>..</span>
Otherwise, you can get the elements with this selector (adjust as needed) //td[@class="PadLeft10"]//child::span//following-sibling::text()[1], split the result on ':' and then proceed, but this remains a makeshift solution.
Any direction would be greatly appreciated.
As for a direction: the sections within the table cell that you mention are neither nested DOM items nor sibling-type DOM nodes. They are sequential HTML fragments separated by <br/> and require special processing.
<br/>Company Name Here</span>
<br/>Primary Contact: Mr. Eric Cartman
<br/>Phone: (555) 555-5555
<br/>...
Both xpath and regex can be leveraged for such a case.
You can select the text node you're looking for using a predicate and the contains function:
//td[@class="PadLeft10"]/text()[contains(., "Primary Contact:")]
Then you can get the substring using the substring-after function:
substring-after(
  //td[@class="PadLeft10"]/text()[contains(., "Primary Contact:")],
  'Primary Contact:'
)
And remove leading and trailing whitespace using normalize-space:
normalize-space(
  substring-after(
    //td[@class="PadLeft10"]/text()[contains(., "Primary Contact:")],
    'Primary Contact:'
  )
)
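The three steps above (pick the text node containing the label, take what follows the label, trim whitespace) can be mimicked outside XPath as well. The following is only a sketch in Python using the standard library's xml.etree.ElementTree; the function name field and the shortened cell markup are assumptions made for this example, not part of the original page or of the Dataminer plugin.

```python
import xml.etree.ElementTree as ET

# Shortened version of the first record's cell from the question.
CELL = (
    '<td class="PadLeft10"><span><br/>Company Name Here</span>'
    '<br/>Primary Contact: Mr. Eric Cartman <br/>Phone: (555) 555-5555'
    '<br/>Fax: (333) 333-3333<br/>Organization Type: Distributor Branch\n'
    '<br/></td>'
)


def field(cell, label):
    """Return the text following `label`, or "" if the label is absent."""
    # itertext() yields every text fragment in document order, including
    # text inside nested elements (closer to td//text() than to td/text()).
    for text in cell.itertext():
        if label in text:
            after = text.partition(label)[2]  # like substring-after()
            return " ".join(after.split())    # like normalize-space()
    return ""  # substring-after() on an empty node-set also yields ""


cell = ET.fromstring(CELL)
contact = field(cell, "Primary Contact:")
phone = field(cell, "Phone:")
website = field(cell, "Web Site:")  # this record has no web site
print(repr(contact), repr(phone), repr(website))
```

Returning an empty string for a missing label matches the XPath behaviour and handles the records that lack an address or web site.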
I have an XML like that:
<tr class="TREven">
<td class="Col0">
<span>
<b>Diary Compliance:</b>
Number of Daily Reports completed
<br/>
<i>
* Must be
<u>24</u>
or more
</i>
</span>
</td>
<td class="Col1">
<span class="Red">4 - Not eligible</span>
</td>
</tr>
I don't know how to select "4 - Not eligible" based on my input text (Diary Compliance: Number of Daily Reports completed * Must be 24 or more), which is spread across several child nodes of the preceding span.
Could you help me?
Thanks,
This is also my problem.
I often use this XPath to get the element in cell 2 based on the element in cell 1 of a table row like his:
//span[contains(text(),'Diary Compliance: Number of Daily Reports completed * Must be 24 or more')]/../..//span[contains(text(),'4 - Not eligible')]
It works until the element in cell 1 contains many formatting nodes; then I can't match my exact text against the element in cell 1. In my case, I must match the exact text, not just a substring.
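One likely reason the expression above fails is that in XPath 1.0 contains(text(), ...) only looks at the first text node child of the span, while the label here is spread over <b>, <i> and <u>. Comparing the whitespace-normalized string value of the whole cell avoids that, for example //tr[normalize-space(td[1]) = 'Diary Compliance: Number of Daily Reports completed * Must be 24 or more']/td[2]/span. The Python sketch below shows the same comparison using only the standard library; the helper names flat_text and value_for_label are invented for this illustration.

```python
import xml.etree.ElementTree as ET

ROW = """<tr class="TREven">
  <td class="Col0">
    <span>
      <b>Diary Compliance:</b>
      Number of Daily Reports completed
      <br/>
      <i>
        * Must be
        <u>24</u>
        or more
      </i>
    </span>
  </td>
  <td class="Col1">
    <span class="Red">4 - Not eligible</span>
  </td>
</tr>"""


def flat_text(element):
    """All descendant text joined, with whitespace runs collapsed."""
    return " ".join("".join(element.itertext()).split())


def value_for_label(row, wanted):
    """Return the text of cell 2 if cell 1 matches `wanted` exactly."""
    cells = row.findall("td")
    if len(cells) >= 2 and flat_text(cells[0]) == wanted:
        return flat_text(cells[1])
    return None


row = ET.fromstring(ROW)
label = "Diary Compliance: Number of Daily Reports completed * Must be 24 or more"
value = value_for_label(row, label)
print(value)
```

This is an exact match on the full label, not a substring test, which is what the second poster asked for.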
This is my html code:
<tr>
<th class="left_cont"><strong>Hello world</strong></th>
<td class="right_cont padding_left16px"><strong>Hi There</strong></td>
</tr>
Now, to select the text Hello world I used:
//strong[contains(text(),'Hello world')]
Works fine for me.
Now I need to select the text Hi There relative to the Hello world text.
I need to do something like this but I can't figure out.
//strong[contains(text(),'Hello world')]/following-sibling::strong
Doesn't work out for me.
The elements that are siblings are the parents of the <strong> elements, not the <strong> elements themselves, so you can try this:
//*[strong[contains(.,'Hello world')]]/following-sibling::*[strong]/strong
Or, if you are sure the parents involved are always <th> and <td>:
//th[strong[contains(.,'Hello world')]]/following-sibling::td[strong]/strong
The second "strong" element is not actually a sibling of the first one, but the wrapping "th" and "td" elements are siblings. So you could probably use
//strong[contains(text(),'Hello world')]/../following-sibling::td/strong
I have a table that looks like this
<table cellpadding="1" cellspacing="0" width="100%" border="0">
<tr>
<td colspan="9" class="csoGreen"><b class="white">Bill Statement Detail</b></td>
</tr>
<tr style="background-color: #D8E4F6;vertical-align: top;">
<td nowrap="nowrap"><b>Bill Date</b></td>
<td nowrap="nowrap"><b>Bill Amount</b></td>
<td nowrap="nowrap"><b>Bill Due Date</b></td>
<td nowrap="nowrap"><b>Bill (PDF)</b></td>
</tr>
</table>
I am trying to create the XPath to find this table when it contains the text Bill Statement Detail. I want the entire table and not just the td.
Here is what I have tried so far:
page.parser.xpath('//table[contains(text(),"Bill")]')
page.parser.xpath('//table/tbody/tr[contains(text(),"Bill Statement Detail")]')
Any Help is appreciated
Thanks!
Your first XPath example is the closest in that you're selecting table. The second example, if it ever matched, would select tr. It will not work mainly because, according to your example, the text you want is in a b node, not directly in a tr node.
This solution is as vague as I could make it, because of *. If the target text will always be under b, change it to descendant::b:
//table[contains(descendant::*, 'Bill Statement Detail')]
This is as specific, given the example, as I can make:
//table[tr[1]/td/b[.='Bill Statement Detail']]
You might want
//table[contains(descendant::text(),"Bill Statement Detail")]
The suggested expressions don't work well if the matching text is not in the first row. See the related post "Find a table containing specific text".
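The first-row limitation comes from how XPath 1.0 converts a node-set to a string: contains(descendant::*, ...) and contains(descendant::text(), ...) only inspect the first node of the set. Testing the table's whole string value, as in //table[contains(., 'Bill Statement Detail')], should not depend on the row. Below is a small Python sketch of that "whole string value" check using only the standard library; the page markup, the id attributes and the function name tables_containing are hypothetical and exist only to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the target text is deliberately NOT in the first row.
PAGE = """<html><body>
<table id="other"><tr><td>Nothing here</td></tr></table>
<table id="bills">
  <tr><td><b>Bill Date</b></td><td><b>Bill Amount</b></td></tr>
  <tr><td colspan="9" class="csoGreen"><b class="white">Bill Statement Detail</b></td></tr>
</table>
</body></html>"""


def tables_containing(root, needle):
    """Return every <table> whose combined descendant text contains `needle`."""
    return [
        table
        for table in root.iter("table")
        if needle in "".join(table.itertext())
    ]


root = ET.fromstring(PAGE)
matches = tables_containing(root, "Bill Statement Detail")
ids = [table.get("id") for table in matches]
print(ids)
```

Note that with nested tables every enclosing table would match as well, so the innermost match may need to be picked explicitly.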
Problem:
Given a table, a specific piece of content should appear in the same column as a specific header.
Clarification:
I can not test the column position numerically, or at least I can't hardcode it that way, since the number of columns can change based on various other conditions and I don't want to make my test that fragile.
Example:
Name || Phone Number || Address
==============================================================
... || ... || ...
Joe || 555-787-7878 || 42 Nowhere Lane, Mulberry, California
... || ... ||
With the code looking like so:
<table>
<tr>
<th>Name</th>
<th>Phone Number</th>
<th>Registered</th>
<th>Address</th>
</tr>
<tr>
...
</tr>
<tr>
<td>Joe</td>
<td>555-377-7347</td>
<td>Yes</td>
<td>42 Nowhere Lane1, Mulberry1, California</td>
</tr>
<tr>
<td>Jerry</td>
<td>555-787-7878</td>
<td>Yes</td>
<td>50 Nowhere Lane, Mulberry, California</td>
</tr>
<tr>
<td>Tom</td>
<td>555-678-0987</td>
<td>No</td>
<td>43 Nowhere Lane2, Mulberry2, California</td>
</tr>
<tr>
...
</tr>
</table>
Scenario:
I want to ensure the correct address (42 Nowhere...) appears in the column with the header "Address".
How can I do this?
The solution might be as simple as a decent xpath query, to be honest, perhaps I don't even need anything particularly "Capybara" related!
I came across a similar problem, but here I need to check whether 'Jerry' is registered or not. Please help me automate this using Ruby/Capybara.
Thanks in Advance
I think @Snekse is very close. Here are some expressions that have been tested. The following returns the table cell corresponding to the th whose value is Address:
/table/tr/td[(count(../../tr/th[.='Address']/preceding-sibling::*)+1)]
This can be used to get the value of the cell:
42 Nowhere Lane, Mulberry, California
...which you could then use to perform the final test:
/table/tr/td[(count(../../tr/th[.='Address']/preceding-sibling::*)+1)]
[.='42 Nowhere Lane, Mulberry, California']
This will return the cell itself, if the addresses match. It will return an empty node-set, which will evaluate to false in a boolean context, if no such node exists.
You probably also need to specify the row you're testing, which can be done with a slight modification to the start of the expression:
/table/tr[$n]/<rest_of_expression>
...where $n is the index of the row to select.
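The counting trick above (header position = number of preceding siblings + 1) amounts to looking up the column index from the header row and reusing it in a data row. Here is a rough Python sketch of that lookup using only the standard library; it is not Capybara code, the function name cell_value is made up, and it assumes the first cell of each row identifies the row. It also covers the follow-up question about whether Jerry is registered.

```python
import xml.etree.ElementTree as ET

# The table from the question, without the elided rows.
TABLE = """<table>
<tr><th>Name</th><th>Phone Number</th><th>Registered</th><th>Address</th></tr>
<tr><td>Joe</td><td>555-377-7347</td><td>Yes</td><td>42 Nowhere Lane1, Mulberry1, California</td></tr>
<tr><td>Jerry</td><td>555-787-7878</td><td>Yes</td><td>50 Nowhere Lane, Mulberry, California</td></tr>
<tr><td>Tom</td><td>555-678-0987</td><td>No</td><td>43 Nowhere Lane2, Mulberry2, California</td></tr>
</table>"""


def cell_value(table, row_key, header):
    """Return the cell under `header` in the row whose first cell is `row_key`."""
    headers = [th.text for th in table.findall("./tr/th")]
    col = headers.index(header)  # raises ValueError if the header is missing
    for tr in table.findall("./tr"):
        cells = tr.findall("td")
        # The header row has no <td> cells, so it is skipped here.
        if cells and cells[0].text == row_key:
            return cells[col].text
    return None  # no row with that key


table = ET.fromstring(TABLE)
joe_address = cell_value(table, "Joe", "Address")
jerry_registered = cell_value(table, "Jerry", "Registered")
print(joe_address, jerry_registered)
```

Since the column is found by header text, the check keeps working when columns are added or reordered.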
I would think you would first have to get the position of Address, then use that information to get the value from the data table.
So something like:
var x = count(/table/tr/th[.='Address']/preceding-sibling::*)+1
var address = /table/tr/td[position()=${x}]
Or combine it into something like:
/table/tr/td[position()=(count(/table/tr/th[.='Address']/preceding-sibling::*)+1)]
NOTE: I didn't run these statements, so I have no idea if they are valid xpath syntax or not, but it should be close enough to get you there.
REFERENCE: Find position of a node using xPath