Using XPath to find an href in Selenium WebDriver

I have to find the following href using Selenium in Java:
<tr>
<td>
<a target="mainFrame" href="reb.php?tiEx=ES"></a>
</td>
</tr>
thanks

There are multiple ways to find the link, depending on the elements that contain it and on how unique its attribute values are on the page. From what you've provided, you can rely on the target attribute:
//a[@target="mainFrame"]
You can also narrow it down to the scope of its parents:
//tr/td/a[@target="mainFrame"]
You can additionally check the href attribute, if it is persistent and reliable:
//tr/td/a[@target="mainFrame" and @href="reb.php?tiEx=ES"]
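As a quick offline check of these expressions, here is a minimal Python sketch that runs them against the row from the question using only the standard library. It is not Selenium code: in Selenium's Java bindings the same XPath strings would be passed to By.xpath. The wrapping <table> element is an assumption added so the fragment parses as a single XML document, and because ElementTree implements only a subset of XPath, the "and" condition is written as two chained predicates.

```python
import xml.etree.ElementTree as ET

# The row from the question, wrapped in <table> (an assumption) so it is one XML fragment.
HTML = """
<table>
  <tr>
    <td>
      <a target="mainFrame" href="reb.php?tiEx=ES"></a>
    </td>
  </tr>
</table>
"""

root = ET.fromstring(HTML)

# ElementTree paths are relative to the element, hence the leading "." before "//".
by_target = root.findall(".//a[@target='mainFrame']")

# ElementTree has no "and" operator; chaining two predicates filters on both attributes.
by_both = root.findall(".//tr/td/a[@target='mainFrame'][@href='reb.php?tiEx=ES']")

hrefs = [a.get("href") for a in by_both]
print(hrefs)
```

Both lookups match the same single link here; on a real page the narrower expression is only worth the extra brittleness if the target attribute alone is ambiguous.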

Related

Scrapy - Scraping hidden elements

I think what I want to ask is whether it's possible to get around sql:hide (https://learn.microsoft.com/en-us/sql/relational-databases/sqlxml-annotated-xsd-schemas-using/hiding-elements-and-attributes-by-using-sql-hide?view=sql-server-2017), but I've described my actual problem below in case I'm mistaken:
I'm trying to scrape the "foo" urls from a website with a DOM similar to the following:
<html>
<body>
<tbody>
<tr>
...
...
</tr>
</tbody>
<table>
<tbody>
<tr>
...
</tr>
<tr>
...
</tr>
</tbody>
</table>
</body>
</html>
Whenever I try print(response.css('a')) or equivalently print(response.xpath('//a')), I can see the "foo" urls, but not the "bar" urls. Additionally, using XPath I can access up to the table, but print(response.xpath('//table//*')) and print(response.xpath('//table//a')) both output [].
Could it be possible that the elements of table have been hidden from Scrapy somehow? How would one resolve this?
Thanks in advance. This is mainly for interest as the urls have a predictable pattern anyway.
This is just a guess, but you can try
//a[starts-with(@href,'foo')]/text()
This should give you the text of all a tags whose href attribute value starts with the string 'foo'.
It is also possible that some parts of the resulting XML/HTML are loaded by JavaScript at a later time, which would explain your difficulty locating certain elements.
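To make the starts-with filter concrete, here is a small Python sketch of the same selection done with the standard library. The page below is a hypothetical, well-formed stand-in (the question's real markup is not shown), and the link texts and href values are invented for illustration. ElementTree has no starts-with() function, so the prefix test is done in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page; the hrefs and link texts are made up.
PAGE = """
<html>
  <body>
    <a href="foo/1">first foo</a>
    <a href="bar/1">first bar</a>
    <table>
      <tbody>
        <tr><td><a href="foo/2">second foo</a></td></tr>
      </tbody>
    </table>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Equivalent of //a[starts-with(@href,'foo')]/text(): walk every <a> in
# document order and keep the text of those whose href begins with "foo".
foo_texts = [
    a.text
    for a in root.iter("a")
    if a.get("href", "").startswith("foo")
]
print(foo_texts)
```

If a filter like this finds links in the static markup but the table rows never appear in the downloaded response, that would be consistent with the rows being added by JavaScript after the page loads.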

Excel VBA to Deal With AJAX

I am practicing using Excel VBA to download information from a website: http://mops.twse.com.tw/mops/web/t05sr01_1
But I have no idea how to download the data behind the button click, as shown in this image: http://i.stack.imgur.com/KZHiZ.jpg
I have excerpted the page's markup below. Could anyone explain how to code this in Excel VBA to get the data?
Thank you very much.
Web code:
<td style='text-align:left !important;' nowrap>鴻海</td>
<td style='text-align:left !important;'>105/01/05</td>
<td style='text-align:left !important;'>11:41:00</td>
<td style='text-align:left !important;'>說明媒體報導</td>
<td><input type='button' value='詳細資料' onclick="document.fm_t05sr01_1.SEQ_NO.value='1';document.fm_t05sr01_1.SPOKE_TIME.value='114100';document.fm_t05sr01_1.SPOKE_DATE.value='20160105';document.fm_t05sr01_1.COMPANY_NAME.value='?E??';document.fm_t05sr01_1.COMPANY_ID.value='2317';document.fm_t05sr01_1.skey.value='2317201601051';document.fm_t05sr01_1.hhc_co_name.value='?E??';ajax1(this.form,'table01');">
You haven't shown how you are getting the HTML.
You can use a CSS selector.
A general selector for the first input button:
input[type=button]
This matches element(s) with the input tag whose type attribute has the value 'button'.
You apply it with the querySelector method, or with querySelectorAll if there is more than one match, and then index into the result for the required element.
ie.document.querySelector("input[type=button]").Click
If the document is in an HTMLDocument variable, e.g. htmlDoc, then
htmlDoc.querySelector("input[type=button]").Click

Select table in Mechanize (ruby)

I am trying to get information from a table on a website with my bot.
The table just has "map_table" as its id (CSS attribute), each tr has "map_tr", and each cell has "map_td".
I want to detect the cells with a link containing "msg.php" in its href.
Example:
<td id="map_td">
</td>
This one must not be selected.
<td id="map_td">
</td>
This one has to be selected. I have searched the Mechanize docs and forums, but I haven't found anything.
Can you help me?
That should be:
page.search('td:has(a[href*="msg.php"])')
It's the Nokogiri docs that you want to look at, or really the CSS or XPath specs if you're not familiar with either of them.

Selenium IDE and xpath - find text / row in table and select radio box

I've been using Selenium IDE and getting some good results. I've done a lot of reading about following-sibling and preceding-sibling, but I can't locate the right radio button.
Essentially, I want to find the row in a table with the word 'testing' and then click the radio button in that row.
So far I can find the input button
//input[@type='radio']
and find the text testing
//a[contains(text(),'testing')]
I've been trying to use this in the IDE
check | //input[@type='radio']/following-sibling::td[1]/a[contains(text(),'testing')]
but I get the error [error] locator not found: //input[@type='radio']/following-sibling::a[contains(text()[1],'testing')]
Any help to change this is really appreciated :)
Cheers
Damien
here's the bare basic table ...
<tbody id="list">
<tr>
<th>
<label class="radio">
<input class="presentation_radio" type="radio" value="1" name="presentation_radio">
</label>
</th>
<td>
testing
</td>
<td>testing</td>
<td>Joe Acme</td>
<td>Presentation</td>
<td>03 May 2012</td>
<td>5 (1)</td>
</tr>
</tbody>
The problem with your XPath is that td and input are not siblings (they don't have a common parent). Even if you change your XPath to a more correct version:
//input[@type='radio']/following::td[1]/a[contains(text(),'testing')]
it will find the a element that follows the radio button instead of the radio button itself. So the correct XPath would be:
//a[contains(text(),'testing')]/preceding::input[@type='radio'][1]
or
//tr[descendant::a[contains(.,'testing')]]//input[@type='radio']
For an XPath axes tutorial, read this: http://msdn.microsoft.com/en-us/library/ms256456.aspx
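The row-scoped version of that lookup can be illustrated with a short Python sketch using only the standard library. The markup is an assumption: it is the question's table with a link restored around "testing", the input tags self-closed so the fragment is well-formed XML, and a second row added for contrast. ElementTree supports neither the descendant axis nor contains(), so both are emulated in Python; radios_for is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed markup: the question's table with the link restored around "testing",
# <input> self-closed for XML, and a second row added for contrast.
TABLE = """
<table>
  <tbody id="list">
    <tr>
      <th><label class="radio"><input class="presentation_radio" type="radio" value="1" name="presentation_radio"/></label></th>
      <td><a href="#">testing</a></td>
      <td>Joe Acme</td>
    </tr>
    <tr>
      <th><label class="radio"><input class="presentation_radio" type="radio" value="2" name="presentation_radio"/></label></th>
      <td><a href="#">other</a></td>
      <td>Jane Acme</td>
    </tr>
  </tbody>
</table>
"""

root = ET.fromstring(TABLE)


def radios_for(text):
    """Return the radio inputs of every row that has a link containing `text`."""
    found = []
    for row in root.iter("tr"):
        # Emulates tr[descendant::a[contains(., text)]]
        has_link = any(text in "".join(a.itertext()) for a in row.iter("a"))
        if has_link:
            # Emulates //input[@type='radio'], scoped to this row
            found.extend(row.findall(".//input[@type='radio']"))
    return found


values = [radio.get("value") for radio in radios_for("testing")]
print(values)
```

Starting from the row and then searching downward avoids depending on whether the radio button comes before or after the link in document order.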

Xpath query to find elements which contain a certain descendant

I'm using Html Agility Pack to run xpath queries on a web page. I want to find the rows in a table which contain a certain interesting element. In the example below, I want to fetch the second row.
<table name="important">
<tr>
<td>Stuff I'm NOT interested in</td>
</tr>
<tr>
<td>Stuff I'm interested in</td>
<td><interestingtag/></td>
<td>More stuff I'm interested in</td>
</tr>
<tr>
<td>Stuff I'm NOT interested in</td>
</tr>
<tr>
<td>Stuff I'm NOT interested in</td>
</tr>
</table>
I'm looking to do something like this:
//table[@name='important']/tr[has a descendant named interestingtag]
Except with valid xpath syntax. ;-)
I suppose I could just find the interesting element itself and then work my way up the parent chain from the node that's returned, but it seemed like there ought to be a way to do this in one step and I'm just being dense.
"has a descendant named interestingtag" is spelled .//interestingtag in XPath, so the expression you are looking for is:
//table[@name='important']/tr[.//interestingtag]
Actually, you need to look for a descendant, not a child:
//table[@name='important']/tr[descendant::interestingtag]
I know this isn't what the OP was asking, but if you wanted to find an element that had a descendant with a particular attribute, you could do something like this:
//table[@name='important']/tr[.//*[@attr='value']]
I know this is a late answer, but why not go the other way around? Find all <interestingtag/> tags and then select the ancestor <tr> tag:
//interestingtag/ancestor::tr
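For readers who want to see the descendant test applied to the question's table, here is a minimal Python sketch using only the standard library. It is an illustration, not Html Agility Pack code: the <html><body> wrapper is an assumption, the table is shortened to three rows, and because ElementTree does not accept a path such as .//interestingtag inside a predicate, the "has a descendant" check is done in Python for each row.

```python
import xml.etree.ElementTree as ET

# The question's table (shortened to three rows) inside an assumed <html><body> wrapper.
DOC = """
<html><body>
<table name="important">
<tr>
<td>Stuff I'm NOT interested in</td>
</tr>
<tr>
<td>Stuff I'm interested in</td>
<td><interestingtag/></td>
<td>More stuff I'm interested in</td>
</tr>
<tr>
<td>Stuff I'm NOT interested in</td>
</tr>
</table>
</body></html>
"""

root = ET.fromstring(DOC)

# //table[@name='important']/tr
rows = root.findall(".//table[@name='important']/tr")

# Emulates tr[.//interestingtag]: keep rows that have a matching descendant.
matching = [row for row in rows if row.find(".//interestingtag") is not None]

first_cells = [row.find("td").text for row in matching]
print(first_cells)
```

A full XPath 1.0 engine evaluates the one-line expressions above directly; the loop is only needed here because of ElementTree's reduced XPath subset.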
