How to show only a few lines with JSTL inside an HTML td tag? - jstl

All,
I have a simple output on my website that uses JSTL inside an HTML td tag. I would like to show only a few lines of the value, maybe the last 5. Is there a way to do this?
See below:
<td class="domain"><c:out value="${row.domain}" escapeXml="false" /></td>

Related

Scrapy - Scraping hidden elements

I think what I want to ask is whether it's possible to get around sql:hide (https://learn.microsoft.com/en-us/sql/relational-databases/sqlxml-annotated-xsd-schemas-using/hiding-elements-and-attributes-by-using-sql-hide?view=sql-server-2017), but I've described my actual problem below in case I'm mistaken:
I'm trying to scrape the "foo" urls from a website with a DOM similar to the following:
<html>
<body>
<tbody>
<tr>
...
...
</tr>
</tbody>
<table>
<tbody>
<tr>
...
</tr>
<tr>
...
</tr>
</tbody>
</table>
</body>
</html>
Whenever I try print(response.css('a')) or equivalently print(response.xpath('//a')), I can see the "foo" urls, but not the "bar" urls. Additionally, using XPath I can access up to the table, but print(response.xpath('//table//*')) and print(response.xpath('//table//a')) both output [].
Could it be possible that the elements of table have been hidden from Scrapy somehow? How would one resolve this?
Thanks in advance. This is mainly for interest as the urls have a predictable pattern anyway.
I know that this is just a wild guess, but you can try
//a[starts-with(@href,'foo')]/text()
This should give you the text values of all a tags that have an href attribute whose value starts with the string 'foo'.
It is also possible that some parts of the resulting XML/HTML are loaded by JavaScript at a later time, which would explain your difficulty locating certain elements.
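
For anyone who wants to try the idea from the answer without Scrapy, here is a minimal stdlib-only sketch. It assumes a small, well-formed sample document with invented href values (the real page is not shown in the question). Python's xml.etree.ElementTree only implements a subset of XPath and has no starts-with(), so the prefix test is done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the href values and link texts are invented
# for illustration, they do not come from the original site.
SAMPLE = """
<body>
  <a href="foo/1">first</a>
  <a href="bar/2">second</a>
  <table><tr><td><a href="foo/3">third</a></td></tr></table>
</body>
"""


def link_texts_with_prefix(markup, prefix):
    """Return the text of every <a> whose href starts with `prefix`."""
    root = ET.fromstring(markup.strip())
    # iter("a") visits <a> elements at any depth, like //a in XPath.
    return [a.text for a in root.iter("a")
            if a.get("href", "").startswith(prefix)]


foo_texts = link_texts_with_prefix(SAMPLE, "foo")
print(foo_texts)
```

If the same filter returns nothing for links that are visible in the browser, that points toward the JavaScript explanation: the elements are simply not in the HTML that was downloaded.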

Xpath - Exclude elements within TD

I'm trying to use Chrome's scraper extension using XPath. I've been able to scrape everything I need from a table, but I'm stuck in one spot. Here's the source
<td>
<p class="pClass">
<a href="theurl" target="_blank">
<i class="iClass">someText</i>
Anchor text
</a>
</p>
</td>
I'm trying to grab just the URL, but when using my Xpath code as td[9]/p/a it grabs the icon part that says "someText". Is there a way to just grab the URL?
To extract the URL, just add @href to your XPath expression. This should work: //td[9]/p/a/@href.
For stripping white space you can use xpath function normalize-space().
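
As a rough illustration of both points (reading the href attribute rather than the element text, and collapsing whitespace), here is a stdlib-only Python sketch. It assumes a cut-down row with a single td instead of the ninth column, and it uses " ".join(text.split()) as a stand-in for normalize-space(), since ElementTree does not provide XPath functions.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the markup from the question, wrapped in a <tr>.
# Only one <td> is present here, so the path uses td rather than td[9].
ROW = """
<tr>
  <td>
    <p class="pClass">
      <a href="theurl" target="_blank">
        <i class="iClass">someText</i>
        Anchor text
      </a>
    </p>
  </td>
</tr>
"""

row = ET.fromstring(ROW.strip())
anchor = row.find("./td/p/a")

# Equivalent of .../a/@href: read the attribute, not the text content.
url = anchor.get("href")

# The anchor's own label follows the <i> child, so ElementTree stores it
# as that child's tail. Splitting and re-joining trims the ends and
# collapses inner whitespace runs, like normalize-space().
label = " ".join(anchor.find("i").tail.split())

print(url)
print(label)
```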

Excel VBA to Deal With AJAX

I am practicing using Excel VBA to download information from a website: http://mops.twse.com.tw/mops/web/t05sr01_1
But I have no idea how to download the data that appears after clicking the button, as shown in this image: http://i.stack.imgur.com/KZHiZ.jpg
I have excerpted the page's HTML below. Could anyone explain how to code this in Excel VBA to get the data?
Thank you very much.
Web code:
<td style='text-align:left !important;' nowrap>鴻海</td>
<td style='text-align:left !important;'>105/01/05</td>
<td style='text-align:left !important;'>11:41:00</td>
<td style='text-align:left !important;'>說明媒體報導</td>
<td><input type='button' value='詳細資料' onclick="document.fm_t05sr01_1.SEQ_NO.value='1';document.fm_t05sr01_1.SPOKE_TIME.value='114100';document.fm_t05sr01_1.SPOKE_DATE.value='20160105';document.fm_t05sr01_1.COMPANY_NAME.value='?E??';document.fm_t05sr01_1.COMPANY_ID.value='2317';document.fm_t05sr01_1.skey.value='2317201601051';document.fm_t05sr01_1.hhc_co_name.value='?E??';ajax1(this.form,'table01');">
You haven't shown how you are getting the html.
You can use a CSS selector.
In general, for the first input button:
input[type=button]
This selects elements with an input tag whose type attribute has the exact value 'button'.
You apply it with the querySelector method, or with querySelectorAll if there is more than one match, and then index into the result to get the required element.
ie.document.querySelector("input[type=button]").Click
If the document is held in an HTMLDocument variable, e.g. htmlDoc, then:
htmlDoc.querySelector("input[type=button]").Click

Nokogiri: Finding all tags in a direct path, not including arbitrary levels of nesting

Say I have an html document like:
<div id='findMe'>
<table>
<tr>
<td>
<p>
bad
</p>
</td>
</tr>
</table>
<p>
This is some text and this is a link
</p>
</div>
I want to capture all links inside the div #findMe that are inside paragraph tags, but not inside a table or any other tags. So, I want the one labeled "good", but not the one labeled "bad". I'm trying:
Nokogiri::HTML(html).css('#findMe p a')
but that's capturing both links. I also tried a more explicit xpath:
Nokogiri::HTML(html).css('#findMe').xpath('//p/a')
But that's doing the same thing. How can I tell Nokogiri to only search a specific path down the tree?
Use > in CSS to select direct children only.
Nokogiri::HTML(html).css('#findMe > p > a')
Or use / in xpath:
Nokogiri::HTML(html).xpath("//div[@id='findMe']/p/a")
Figured out a way to do it, but I'm still not too comfortable with xpaths so if this isn't the best way feel free to post the more canonical way to achieve this.
Nokogiri::HTML(html).css('#findMe').xpath('//div/p/a')
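
The difference between "any depth" and "direct path" is not specific to Nokogiri. Below is a small sketch of the same idea using Python's standard xml.etree.ElementTree, which supports a limited XPath subset. The href values and the body wrapper are assumptions added for illustration, since the links in the sample above did not survive.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the sample: the <a> tags and their
# href values are invented so that there is something to select.
DOC = """
<body>
  <div id="findMe">
    <table><tr><td><p><a href="bad.html">bad</a></p></td></tr></table>
    <p>This is some text and this is a <a href="good.html">good</a> link</p>
  </div>
</body>
"""

root = ET.fromstring(DOC.strip())

# ".//p/a" matches <a> children of any <p> at any depth, so it also
# picks up the paragraph nested inside the table.
any_depth = [a.text for a in root.findall(".//p/a")]

# After locating the div by id, each "/" step only descends one level,
# so only paragraphs that are direct children of the div are considered.
direct_only = [a.text for a in root.findall(".//div[@id='findMe']/p/a")]

print(any_depth)
print(direct_only)
```

The second path plays the same role as '#findMe > p > a' in CSS: every step is a parent-to-child hop rather than an open-ended descendant search.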

Select table in Mechanize (ruby)

I'm trying to get information from a table on a website with my bot.
The table only has "map_table" as its id, each tr has "map_tr", and each cell has "map_td".
I want to detect the cells that contain a link with "msg.php" in its href.
Ex :
<td id="map_td">
</td>
This one must not be selected
<td id="map_td">
</td>
This one must be selected. I have searched the Mechanize docs and forums, but I haven't found anything.
Can you help me?
That should be:
page.search('td:has(a[href*="msg.php"])')
It's the Nokogiri docs that you want to look at, or really the CSS or XPath specs if you're not familiar with either of them.
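
To make the logic of that selector explicit (keep a td if it contains, at any depth, an a whose href includes "msg.php"), here is a minimal sketch in stdlib Python. The table content is invented for illustration because the cells in the question are empty, and the helper name cells_linking_to is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: the links and href values are made up, only the
# ids follow the description in the question.
TABLE = """
<table id="map_table">
  <tr id="map_tr">
    <td id="map_td"><a href="profile.php?id=1">no</a></td>
    <td id="map_td"><a href="msg.php?to=2">yes</a></td>
    <td id="map_td">empty</td>
  </tr>
</table>
"""


def cells_linking_to(table, fragment):
    """Return <td> elements containing an <a> whose href includes `fragment`."""
    return [td for td in table.iter("td")
            if any(fragment in a.get("href", "") for a in td.iter("a"))]


matches = cells_linking_to(ET.fromstring(TABLE.strip()), "msg.php")
matched_hrefs = [td.find(".//a").get("href") for td in matches]
print(matched_hrefs)
```

The substring test corresponds to the *= attribute operator in the selector, and the any(...) over descendant links corresponds to :has().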
