I am trying to grab the text "Record No: 1" and the two dates from the following html snippet:
<table class="Report">
<tbody>
<tr>
<td>
<font><b>Record No: 1</b><br>
<i>Original Date</i>: 12/16/2011<br>
<i>Original Entered Date</i>: 12/16/2011
<br>
<br>
</font>
</td>
</tr>
</tbody>
</table>
Using HtmlAgilityPack and the following code, I've been able to get the record number, but I'm not sure how to grab the dates.
var recordNum = report.Descendants()
.Where(a=>a.InnerText.Contains("Record No:"))
.Where(a => a.Name == "#text")
.First().InnerText;
Somehow I need to be able to grab the text following the "Original Date" node.
You can use the following XPath to select the text nodes that follow the i element whose inner text equals 'Original Date':
//i[.='Original Date']/following-sibling::text()
Use the XPath as follows, for example:
var doc = new HtmlDocument();
....
var xpath = "//i[.='Original Date']/following-sibling::text()";
var result = doc.DocumentNode.SelectSingleNode(xpath);
Console.WriteLine(result.InnerText);
Output:
: 12/16/2011
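For comparison, the same idea (take the text node that directly follows each i element) can be sketched without XPath. This is a rough Python illustration using only the standard library's html.parser, not HtmlAgilityPack. The names LabelledValueParser and extract_labelled_values are made up for this example, and SNIPPET is a trimmed copy of the HTML from the question. Unlike the XPath result above, it also strips the leading colon.

```python
from html.parser import HTMLParser

# Trimmed copy of the markup from the question.
SNIPPET = """
<font><b>Record No: 1</b><br>
<i>Original Date</i>: 12/16/2011<br>
<i>Original Entered Date</i>: 12/16/2011
<br>
</font>
"""


class LabelledValueParser(HTMLParser):
    """Collects the text that directly follows each <i>label</i> element."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._in_label = False        # True while inside an <i> element
        self._label = None            # text of the most recent <i> element
        self._awaiting_value = False  # True right after </i>

    def handle_starttag(self, tag, attrs):
        if tag == "i":
            self._in_label = True
            self._label = ""
        # Any start tag means the previous label has no pending value any more.
        self._awaiting_value = False

    def handle_endtag(self, tag):
        if tag == "i":
            self._in_label = False
            self._awaiting_value = True

    def handle_data(self, data):
        if self._in_label:
            self._label += data
        elif self._awaiting_value:
            # Drop surrounding whitespace and the ":" separator.
            self.values[self._label] = data.strip().lstrip(":").strip()
            self._awaiting_value = False


def extract_labelled_values(html):
    parser = LabelledValueParser()
    parser.feed(html)
    parser.close()
    return parser.values


dates = extract_labelled_values(SNIPPET)
print(dates["Original Date"])
print(dates["Original Entered Date"])
```

In HtmlAgilityPack itself, the equivalent clean-up would be to trim the colon and whitespace from result.InnerText.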
I'm trying to retrieve data from all the cells in the last two columns of a table with an unknown number of rows and columns. I have to do this in VBScript, which I'm having difficulties with.
In the example below there are 2 rows with 5 columns each. However, in my situation the number of rows and columns varies.
I would like to get the data in the last two columns, i.e. Year 2015 and Year 2016 in the first row, 444 and 555 in the second row, and so on. Since the number of rows and columns varies, I cannot find a fitting script for retrieving the data.
The data should be stored in variables, as I need to pass them into an input field.
<table id="calculations_data" class="key_figures togglable">
<tr>
<th></th>
<th scope="col">Year 2012</th>
<th scope="col">Year 2013</th>
<th scope="col">Year 2014</th>
<th scope="col">Year 2015</th>
<th scope="col">Year 2016</th>
<th></th>
</tr>
<tr id="turnover_data" class="data_row addaptive">
<th class="title">Turnover</th>
<td>111</td>
<td>222</td>
<td>333</td>
<td>444</td>
<td>555</td>
</tr>
</table>
Furthermore, the data is located on a website which I access with this script:
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate "https://data.biq.dk/users/sign_in"
Note: The website above requires login which I manage via my script. The login details cannot be provided.
I have had success with the code below, where the data is located in an h1 tag.
Data_CompanyName = IE.document.getElementsByTagName("h1")(0).innerText
Document.getElementByID("company_name").value = Data_CompanyName
We can extract the innerHTML of that table first and parse that innerHTML as we parse XML. Not sure if this is the best approach but it worked for me.
Try this code:
set objIE = CreateObject("internetexplorer.application")
objIE.visible = true
objIE.navigate "https://data.biq.dk/users/sign_in" '<--make sure the correct path is entered here(the one which takes you to that page containing the table)
while objIE.readyState <>4
Wscript.sleep 1000
Wend
set objTable = objIE.document.getElementById("calculations_data")
strxml = objTable.innerhtml
set objXML = CreateObject("Microsoft.XMLDOM")
objXML.async=False
objXML.loadxml strxml
set objRowNodes = objXML.selectnodes("//tr")
for i=1 to objRowNodes.length-1
tempArr = split(objRowNodes(i).text)
'Displays the data from the second-last and the last column
'(a comment cannot follow the line-continuation underscore, so it goes here)
msgbox "second last: " & tempArr(ubound(tempArr)-1) & vbcrlf & _
"last: " & tempArr(ubound(tempArr))
next
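The "last two columns of every row" logic can also be sketched independently of Internet Explorer and MSXML. The following is only a Python illustration using the standard library, not VBScript. The names RowCollector and last_two_columns are invented for this example, and TABLE is a copy of the table from the question. It assumes that empty cells (such as the blank th at the end of the header row) should be ignored.

```python
from html.parser import HTMLParser

# Copy of the table from the question.
TABLE = """
<table id="calculations_data" class="key_figures togglable">
<tr>
<th></th>
<th scope="col">Year 2012</th>
<th scope="col">Year 2013</th>
<th scope="col">Year 2014</th>
<th scope="col">Year 2015</th>
<th scope="col">Year 2016</th>
<th></th>
</tr>
<tr id="turnover_data" class="data_row addaptive">
<th class="title">Turnover</th>
<td>111</td>
<td>222</td>
<td>333</td>
<td>444</td>
<td>555</td>
</tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collects the text of every th/td cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def last_two_columns(html):
    """Returns, per row, a tuple of the last two non-empty cell texts."""
    collector = RowCollector()
    collector.feed(html)
    collector.close()
    result = []
    for row in collector.rows:
        filled = [cell for cell in row if cell]
        result.append(tuple(filled[-2:]))
    return result


for pair in last_two_columns(TABLE):
    print(pair)
```

Because the slice [-2:] counts from the end, the number of columns does not need to be known in advance.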
When I iterate over the web table below, I get a row count of 3 (including the hidden row),
but I can see only 2 rows in my application.
I can get the row count with the help of descriptive programming, but I want to iterate over only the rows that are visible.
<table>
<tbody>
<tr class="show">Name</tr>
<tr class="hide">Ticket</tr>
<tr class="show">city</tr>
</tbody>
</table>
I have tried the code below, but it displays the hidden row's text as well:
for i=1 to rowcount
print oWebtable.getcelldata(i,2)
next
Actual output:
Name,
Ticket,
city
Expected output:
Name,
city
UFT has no knowledge of your show/hide class names. If you want to filter out some rows, you need to do it yourself.
Set desc = Description.Create()
desc("html tag").Value = "TR"
desc("class").Value = "show"
Set cells = oWebtable.ChildObjects(desc)
Print "Count: " & cells.Count
For i = 0 To cells.Count - 1
Print i & ": " & cells(i).GetROProperty("inner_text")
Next
Note that I had to add TD elements to your table in order for this to work since it's invalid HTML to have text in a TR element.
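The filtering step itself (keep only rows whose class is "show") is not specific to UFT. As a rough illustration only, here is the same filter in Python with the standard library's html.parser. The names VisibleRowCollector and visible_row_texts are made up for this sketch, and TABLE is the question's table with the TD elements added as described above.

```python
from html.parser import HTMLParser

# The table from the question, with TD elements added so that it is valid HTML.
TABLE = """
<table>
<tbody>
<tr class="show"><td>Name</td></tr>
<tr class="hide"><td>Ticket</td></tr>
<tr class="show"><td>city</td></tr>
</tbody>
</table>
"""


class VisibleRowCollector(HTMLParser):
    """Collects the text of rows whose class attribute contains a given name."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.texts = []
        self._chunks = None  # text chunks of the current kept row, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            classes = (dict(attrs).get("class") or "").split()
            self._chunks = [] if self.wanted_class in classes else None

    def handle_data(self, data):
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "tr" and self._chunks is not None:
            self.texts.append("".join(self._chunks).strip())
            self._chunks = None


def visible_row_texts(html, wanted_class="show"):
    collector = VisibleRowCollector(wanted_class)
    collector.feed(html)
    collector.close()
    return collector.texts


print(visible_row_texts(TABLE))
```

Note that this only checks the class name, exactly like the description object above; it does not know whether the stylesheet really hides the other rows.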
I'm trying to parse a value from an HTML table (below) using Ruby, Watir and regular expressions. I want to parse the id info from the anchor tag if the table rows have specified values. For example, if Event1, Action2 are my target row selections, then my goal is to get the "edit_#" for the table row.
Table:
Event1 | Action2
Event2 | Action3
Event3 | Action4
Example of HTML (I've cut some info out since this is work code, hopefully you get the idea):
<div id="table1">
<table class="table1" cellspacing="0">
<tbody>
<tr>
<tr class="normal">
<td>Event1</td>
<td>Action2</td>
<td>
<a id="edit_3162" blah blah blah… >
</a>
</td>
</tr>
<tr class="alt">
<td> Event2</td>
<td>Action3 </td>
<td>
<a id="edit_3163" " blah blah blah…>
</a>
</td>
</tr>
</tbody>
</table>
</div>
I have tried the following, which doesn't really work:
wb=Watir::Browser.new
wb.goto "myURLtoSomepage"
event = "Event1"
action = "Action2"
table = wb.div(:id, "table1")
policy_row = table.trs(:text, /#{event}#{action}/)
puts policy_row
policy_id = policy_row.html.match(/edit_(\d*)/)[1]
puts policy_id
This results in the following error, which points to the policy_id = ... line:
undefined method 'html' for #<Watir::TableRowCollection:0x000000029478f0> (NoMethodError)
Any help is appreciated as I am fairly new to ruby and watir.
Something like this should work:
browser.table.trs.each do |tr|
p tr.a.id if tr.td(:index => 0).text == "Event1" and tr.td(:index => 1).text == "Action2"
end
This is an alternative to Željko's answer. Assuming there is only one row that matches, you can use find instead of each to only iterate through the rows until the first match is found (rather than always going through every row).
#The table you want to work with
table = wb.div(:id => 'table1').table
#Find the row with the link based on its tds
matching_row = table.rows.find{ |tr| tr.td(:index => 0).text == "Event1" and tr.td(:index => 1).text == "Action2" }
#Get the id of the link in the row
matching_row.a.id
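The difference between each and find, and the step of pulling the number out of the id, can be sketched outside of Watir. The following is a Python illustration only. ROWS is a hand-written stand-in for the two table rows shown in the question (including their stray whitespace), and find_edit_id is a name invented for this example.

```python
import re

# Stand-in for the two rows shown in the question:
# (first cell text, second cell text, id of the link in the third cell).
ROWS = [
    ("Event1", "Action2", "edit_3162"),
    (" Event2", "Action3 ", "edit_3163"),
]


def find_edit_id(rows, event, action):
    """Returns the digits of the link id in the first matching row, or None."""
    # next() stops at the first match, like Ruby's find; it does not visit
    # the remaining rows.
    match = next(
        (row for row in rows
         if row[0].strip() == event and row[1].strip() == action),
        None,
    )
    if match is None:
        return None
    found = re.fullmatch(r"edit_(\d+)", match[2])
    return found.group(1) if found else None


print(find_edit_id(ROWS, "Event1", "Action2"))
print(find_edit_id(ROWS, "Event3", "Action4"))
```

The None check matters in the Watir version too: if no row matches, find returns nil and matching_row.a.id raises a NoMethodError.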
This is my HTML:
<table dir = "rtl .......">
<tbody>
<script src = "get.aspx?type=js&file=ajax&rev=3"......>
<script language = "JavaScript" src = "get.aspx?type=js&file=mc&rev=6"></script>
<script>..</script>
<tr>
<td class = "d2"...>..</td>
</tr>
<tr>..</tr> <--
<tr>..</tr> <--
<tr>..</tr> <-- these elements
<tr>..</tr> <--
<tr>..</tr> <--
<tr>..</tr> <--
<tr>..</tr> <--
<tr>
<td class = "d2"...>..</td>
</tr>
<script>..</script>
<tr>..</tr>
<tr>..</tr>
<tr>..</tr>
How would I count or select all <tr> elements between the two <td> elements whose class is d2?
The xpath is going to be a long one so brace yourself:
count(//tr[preceding-sibling::tr/td[@class = 'd2']][count(.|//tr[following-sibling::tr/td[@class = 'd2']])=count(//tr[following-sibling::tr/td[@class = 'd2']])])
To select the actual nodes rather than just the count, simply remove the outer count():
//tr[preceding-sibling::tr/td[@class = 'd2']][count(.|//tr[following-sibling::tr/td[@class = 'd2']])=count(//tr[following-sibling::tr/td[@class = 'd2']])]
There are several things happening here, notably:
Select the rows after the start node: tr elements with a preceding sibling tr that has a td child with class 'd2'
Select the rows before the end node: tr elements with a following sibling tr that has a td child with class 'd2'
Use the Kayessian method : http://www.dpawson.co.uk/xsl/sect2/muench.html#d9940e108 to get the intersection of the two node-sets.
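To make the intersection trick easier to follow, here is a small Python sketch of the same logic on a plain list instead of a DOM. It is only an illustration: ROWS is invented sample data in which the string "d2" stands for a row containing a td with class d2, and rows_between_markers is a made-up helper name.

```python
# Invented sample data: "d2" stands for a <tr> that contains <td class="d2">.
ROWS = ["d2", "a", "b", "c", "d2", "x", "y"]


def rows_between_markers(rows, marker="d2"):
    """Returns the rows that lie strictly between the first and last marker."""
    positions = range(len(rows))
    markers = [i for i in positions if rows[i] == marker]

    # ns1: rows that have a marker row among their preceding siblings.
    after_a_marker = {i for i in positions if any(m < i for m in markers)}
    # ns2: rows that have a marker row among their following siblings.
    before_a_marker = {i for i in positions if any(m > i for m in markers)}

    # Kayessian test: a node n from ns1 is also in ns2 exactly when adding n
    # to ns2 does not change its size, i.e. count(. | ns2) = count(ns2).
    return [
        rows[i]
        for i in sorted(after_a_marker)
        if len({i} | before_a_marker) == len(before_a_marker)
    ]


print(rows_between_markers(ROWS))
```

The marker rows themselves drop out: the first one has no marker before it and the last one has no marker after it, which matches what the XPath selects.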
I want a loop to dynamically create a table up to 2 columns wide, and then increase the number of rows until there are no entries left in the list. Sounds easy, and I came up with this:
<table>
<tr>
@{ var i = 0; }
@foreach (var tm in Model.TeamMembers)
{
<td>@tm.FirstName @tm.LastName @tm.Role</td>
if(++i % 2 == 0)
{
</tr>
<tr>
}
}
</tr>
</table>
But I get errors stating } expected both for the for loop and the if statement. If I change the tags to something else (like for instance) it works fine.
My guess is it's trying to validate the end of the row, sees it and decides the loop must be over? How can I make it NOT do that, or do I need to put the entire table inside the loop with a bunch of messy conditionals? :(
Try like this:
@{ var i = 0; }
@foreach (var tm in Model.TeamMembers)
{
<td>@tm.FirstName @tm.LastName @tm.Role</td>
if(++i % 2 == 0)
{
<text></tr><tr></text>
}
}
or:
@{ var i = 0; }
@foreach (var tm in Model.TeamMembers)
{
<td>@tm.FirstName @tm.LastName @tm.Role</td>
if(++i % 2 == 0)
{
@:</tr><tr>
}
}
Razor expects the HTML code following your C# code to be enclosed in a pair of html tags. Here you've got the ending tag first, and the starting tag later, that's why razor had trouble parsing the text.
Enclosing your HTML code block in <text> tags solves this issue, as pointed out by Darin.
You could read this quick guide by Phil Haack: http://haacked.com/archive/2011/01/06/razor-syntax-quick-reference.aspx
In Razor syntax you can use @: to insert literal text.
@:This allows literal text & arbitrary html
You can see how this is implemented in my solution posted below. It properly closes the table with the right number of table cells per row.
I needed a tabular radio button list with the ability to bind a selected value.
@{
int iSelectedId = (int)ViewData["SelectedMember"];
long iCols = 3;
long iCount = Model.TeamMembers.Count();
long iRemainder = iCount % iCols;
// Cast before dividing so the row count is not truncated by integer division.
decimal iDiv = (decimal)iCount / iCols;
var iRows = Math.Ceiling(iDiv);
}
<table>
<tr>
@for (int i = 0; i < iCount; i++)
{
var tm = Model.TeamMembers[i];
<td><input type="radio" name="item" value="@(tm.Id)" @(tm.Id == iSelectedId ? "checked=checked" : "") /> @(tm.FirstName) @(tm.LastName) - @(tm.Role) </td>
if (i % iCols == iCols - 1 && i < iCount - 1)
{
//The alternate syntax for adding arbitrary text/html @:
//was crucial in getting this to work correctly.
@:</tr><tr>
}
}
@if (iRemainder > 0)
{
//Pad the last row so that every row spans the same number of columns.
<td colspan="@(iCols - iRemainder)"> </td>
}
</tr>
</table>
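The row-breaking arithmetic is easier to check outside of a view. Below is a minimal Python sketch of the same layout rule: start a new row after every cols cells and pad the last row with a colspan cell. It is not Razor, the member names are invented, and render_table is a hypothetical helper for this example only.

```python
from html import escape


def render_table(names, cols=3):
    """Lays out names in rows of `cols` cells, padding a short last row."""
    # Split the flat list into chunks of at most `cols` items.
    rows = [names[i:i + cols] for i in range(0, len(names), cols)]
    lines = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{escape(name)}</td>" for name in row)
        missing = cols - len(row)
        if missing:
            # Same idea as colspan = iCols - iRemainder in the view.
            cells += f'<td colspan="{missing}"></td>'
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Invented sample data: five members in three columns.
print(render_table(["Ann", "Bob", "Cy", "Di", "Ed"]))
```

Chunking the list first avoids emitting a closing tag before its opening tag, which is the construct the Razor parser complained about in the question.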