How to select text from table cell that includes <br>? - xpath

I have a table that contains rows like this:
<tr class="premium"><td class="name"><div class="name">John Doe</div>Fancy company name<br />Elmstreet 71<br />454378 Ghostown<br />Tel.: 123 4567 891<br /></td></tr>
<tr class="basic"><td class="name"><div class="name">John Smoe</div>Fancy company name<br />Elmstreet 73<br />456378 Ghostown<br />Tel.: 123 4567 891<br /></td></tr>
I need an XPath expression that selects the company name from rows with class="premium".
Thanks in advance!

On its own, the text() step returns a set of text nodes, split wherever a <br> tag occurs. You can use the string() function to take the first part:
string(//tr[@class="premium"]/td[@class="name"]/text())
or, as kjhughes suggested:
//tr[@class="premium"]/td[@class="name"]/text()[1]
Result:
String='Fancy company name'
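
For readers who want to check this outside an XPath tester, here is a minimal Python sketch using only the standard library. It assumes the rows are well-formed XML and wraps them in a <table> element; the helper names first_text_node and premium_company_names are made up for illustration. ElementTree does not support the text() step, so the sketch reads the text that follows the <div> instead, which is the same node that text()[1] picks in the sample.

```python
import xml.etree.ElementTree as ET

# Sample rows from the question, wrapped in a <table> so they parse as one document.
SAMPLE = """<table>
<tr class="premium"><td class="name"><div class="name">John Doe</div>Fancy company name<br />Elmstreet 71<br />454378 Ghostown<br />Tel.: 123 4567 891<br /></td></tr>
<tr class="basic"><td class="name"><div class="name">John Smoe</div>Fancy company name<br />Elmstreet 73<br />456378 Ghostown<br />Tel.: 123 4567 891<br /></td></tr>
</table>"""


def first_text_node(element):
    """Return the first non-blank text directly inside element, or None.

    ElementTree keeps text nodes in element.text (before the first child)
    and in child.tail (after each child). Unlike text()[1], this helper
    skips whitespace-only nodes.
    """
    candidates = [element.text] + [child.tail for child in element]
    for text in candidates:
        if text and text.strip():
            return text.strip()
    return None


def premium_company_names(markup):
    """Collect the first text node of each name cell in a premium row."""
    root = ET.fromstring(markup)
    cells = root.findall(".//tr[@class='premium']/td[@class='name']")
    return [first_text_node(td) for td in cells]


if __name__ == "__main__":
    print(premium_company_names(SAMPLE))
```

Real-world HTML is often not well-formed XML, in which case an HTML-aware parser would be needed instead of ElementTree.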

Related

Coldfusion, Dropdown menu, Can a single option value contain multiple variables?

I am writing a CFM file to display a dropdown menu, in the dropdown options I want Var1 [Var2, Var3] to display for each line. I tried both concatenation and commas, tried a mix of brackets and quotes and cannot find the right combination. Using Oracle SQL developer. I am a student taking a database management class. I read through Adobe's site and cannot find an example like this. I also combed stack, but everything is PHP and javascript related. The result should have a dropdown, where each selection shows: cus_code with corresponding [cus_fname, cus_lname] for each line. Below is the query that pulls the data needed and the dropdown menu. The place I am having trouble is in the output, where each line from the dropdown should have 3 variables. Every attempt I make causes an error.
<!--Here is my query: grab customer code, last name, first name from customer table, join invoice table on customer code-->
<CFQUERY NAME="INVOICESEARCH" DATASOURCE="ORCL">
SELECT DISTINCT levine04.CUSTOMER6.CUS_CODE, CUS_LNAME, CUS_FNAME
FROM levine04.CUSTOMER6, levine04.INVOICE6
WHERE levine04.CUSTOMER6.CUS_CODE = levine04.INVOICE6.CUS_CODE;
</CFQUERY>
<!--Here is my dropdown:-->
<TR>
<TD ALIGN="right">INV_NUMBER</TD>
<TD>
<INPUT TYPE ="text" NAME="INV_NUMBER" SIZE="10" MAXLENGTH="10">
</TD>
</TR>
<TR>
<TD ALIGN="right">CUS_CODE, [CUS_FNAME, CUS_LNAME]</TD>
<TD>
<SELECT NAME="CUS_CODE" SIZE=1>
<OPTION SELECTED VALUE="ANY">ANY
<CFOUTPUT QUERY="INVOICESEARCH">
<OPTION VALUE="#INVOICESEARCH.CUS_CODE#" + "#INVOICESEARCH.CUS_LNAME#" +"#INVOICESEARCH.CUS_FNAME#"> #CUS_CODE# + #CUSLNAME# + #CUSFNAME#
</CFOUTPUT>
</SELECT>
</TD>
</TR>
Got it! But maybe this will help someone else..
<OPTION VALUE="#INVOICESEARCH.CUS_CODE#"> #CUS_CODE#[#CUS_LNAME#, #CUS_FNAME#]

XPath - Get table if child is not specific string

Is it possible to do this? I want to get all of a table's "tr" elements except the tr that contains an element with a specific string.
Example:
<div class="span5">
<table class="table">
<tbody>
<tr>
<th>Apple</th>
<td>Red</td>
</tr>
<tr>
<th>Banana</th>
<td>Yellow</td>
</tr>
<tr>
<th>Potato</th>
<td>Brown</td>
</tr>
</tbody>
</table>
</div>
A simple example: a table with 2 columns. I can select the table with the following XPath:
//div[@class='span5']/table[@class='table']
But is it possible to select the table WITHOUT the "tr" that contains:
//th[.='Potato']
I usually solve this problem by getting the whole table and then filtering the "tr" contents in Python, but I want to filter with XPath and optimize my code a bit without loading everything into memory.
Thanks
Your XPath can be a bit simpler, like so:
//div[@class='span5']/table[@class='table']//tr[th != 'Potato']
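
To see which rows such a filter should keep, here is a small Python sketch using only the standard library. It is an illustration under assumptions, not the answerer's method: ElementTree supports only a subset of XPath, so the comparison is done in Python, and the helper names rows_without and headers are hypothetical. The sketch keeps a row only when none of its th cells equals the excluded string, which corresponds to tr[not(th = 'Potato')]. With one th per row, as in the sample, that gives the same rows as th != 'Potato'.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question (well-formed, so ElementTree can parse it).
SAMPLE = """<div class="span5">
<table class="table">
<tbody>
<tr><th>Apple</th><td>Red</td></tr>
<tr><th>Banana</th><td>Yellow</td></tr>
<tr><th>Potato</th><td>Brown</td></tr>
</tbody>
</table>
</div>"""


def rows_without(markup, excluded):
    """Return the <tr> elements whose <th> cells never equal `excluded`."""
    root = ET.fromstring(markup)
    rows = root.findall(".//table[@class='table']//tr")
    return [
        tr for tr in rows
        if all(th.text != excluded for th in tr.findall("th"))
    ]


def headers(rows):
    """Return the <th> text of each row, for easy inspection."""
    return [tr.findtext("th") for tr in rows]


if __name__ == "__main__":
    print(headers(rows_without(SAMPLE, "Potato")))
```

A library with full XPath 1.0 support could evaluate the expression directly instead of filtering in Python.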

HtmlAgilityPack algorithm question

I’m using HtmlAgilityPack to obtain some Html from a web site.
Here is the received Html:
<table class="table">
<tr>
<td>
<table class="innertable">...</table>
</td>
</tr>
<tr>
<td colspan="2"><strong>Contact</strong></td>
</tr>
<tr>
<td colspan="2">John Doe</td>
</tr>
<tr>
<td colspan="2">Jane Doe</td>
</tr>
<tr>
<td colspan="2"> </td>
</tr>
<tr>
<td><strong>Units</strong></td>
<td>32</td>
</tr>
<tr>
<td><strong>Year</strong></td>
<td>1998</td>
</tr>
</table>
The Context:
I’m using the following code to get the first table:
var table = document.DocumentNode.SelectNodes("//table[@class='table']").FirstOrDefault();
I’m using the following code to get the inner table:
var innerTable = table.SelectNodes("//table[@class='innertable']").FirstOrDefault();
So far so good!
I need to get some information from the first table and some from the inner table.
Since I begin with the information from the first table I need to skip the first row (which holds the inner table) so I do the following:
var tableCells = table.SelectNodes("tr[position() > 1]/td");
Since I now have all the cells from the first table excluding the inner table, I start doing the following:
string contact1 = HttpUtility.HtmlDecode(tableCells[1].InnerHtml);
string contact2 = HttpUtility.HtmlDecode(tableCells[2].InnerHtml);
string units = HttpUtility.HtmlDecode(tableCells[5].InnerHtml);
string years = HttpUtility.HtmlDecode(tableCells[7].InnerHtml);
The problem:
I’m getting the values I want by hardcoding the index in tableCells[] not thinking the layout would move…unfortunately, it does move.
In some cases I do not have a “Jane Doe” row (as shown in the above Html), this means I may or may not have two contacts.
Because of this, I can’t hardcode the indexes since I might end up having the wrong data in the wrong variables.
So I need to change my approach...
Does anyone know how I could perfect my algorithm so that it can take into account the fact that I may have one or two contacts and perhaps not use hardcoded indexes?
Thanks in advance!
vlince
There is never one unique solution to this kind of problem. Here is an XPath expression that seems to do the job, though:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(yourHtmlFile);
doc.Save(Console.Out);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//tr[td/strong/text() = 'Contact']/following-sibling::tr/td/text()[. != ' ']"))
{
Console.WriteLine(node.OuterHtml);
}
will display this:
John Doe
Jane Doe
32
1998
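
The output above is still a flat list, so the caller has to work out where the contacts end. One possible way to avoid hardcoded indexes is to key each value by its label. The following is a language-neutral sketch of that idea in Python with the standard library, not HtmlAgilityPack code; the function name parse_details is hypothetical, and it assumes the markup is well-formed and that a blank row ends the contact list, as in the sample.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question; the inner table content is elided there too.
SAMPLE = """<table class="table">
<tr><td><table class="innertable">...</table></td></tr>
<tr><td colspan="2"><strong>Contact</strong></td></tr>
<tr><td colspan="2">John Doe</td></tr>
<tr><td colspan="2">Jane Doe</td></tr>
<tr><td colspan="2"> </td></tr>
<tr><td><strong>Units</strong></td><td>32</td></tr>
<tr><td><strong>Year</strong></td><td>1998</td></tr>
</table>"""


def parse_details(markup):
    """Return (contacts, fields) without relying on fixed cell positions."""
    table = ET.fromstring(markup)
    contacts, fields = [], {}
    in_contacts = False
    for row in table.findall("tr"):
        cells = row.findall("td")
        if not cells:
            continue
        label = cells[0].findtext("strong")  # None when the cell has no <strong>
        if label == "Contact":
            in_contacts = True               # contact names follow this row
        elif label is not None and len(cells) == 2:
            in_contacts = False
            fields[label] = (cells[1].text or "").strip()
        elif in_contacts:
            name = (cells[0].text or "").strip()
            if name:
                contacts.append(name)
            else:
                in_contacts = False          # a blank row ends the contact list
    return contacts, fields


if __name__ == "__main__":
    print(parse_details(SAMPLE))
```

Because the values are looked up by label ("Units", "Year"), a missing second contact no longer shifts the other values into the wrong variables.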

Using jQuery Validate to ensure there is at least one row in my table

I am using a table in an entry program to allow the user to add one or more rows of information (much like this article).
I need to ensure that there is at least one row in this table. Google is not really turning much up for me on people doing this. Can anyone give me direction on this? Can I do a count based on a class name?
Here is the layout of my table:
<table id="editorRows">
...
<tbody class="editorRow">
<tr class="row1">
</tr>
<tr class="row2" style="display: none;">
</tr>
<tr class="row3" style="display:none;">
</tr>
</tbody>
</table>
A "row" in this case is the <tbody> tag. Rows 2 and 3 are shown dynamically based on options in row 1.
you can use $("#editorRows tr").length > 0

xpath: how to not match the following data

Viewing source, I have table data formatted exactly like this:
<tr class="even">
<td>apple</td>
<td>pear</td>
<td>orange</td>
</tr>
<tr class="odd">
<td>apple</td>
<td>pear</td>
<td>&nbsp</TD>
</tr>
<tr class="even">
<td>apple</td>
<td>pear</td>
<td>orange</td>
</tr>
How would I go about not matching the <td> containing &nbsp in all rows where it occurs?
The &nbsp; entity isn't something that XPath knows about; it is best to use its equivalent (self-defining) character reference, &#xA0;.
To select all td elements of the top element, table, that do not contain &#xA0;, use:
/table/tr/td[not(contains(., '&#xA0;'))]
To select all rows of this table such that none of their td children contains &#xA0;, use:
/table/tr[not(td[contains(., '&#xA0;')])]
To select all td children of all rows of this table such that none of their td children contains &#xA0;, use:
/table/tr[not(td[contains(., '&#xA0;')])]/td
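
The first two selections can be mirrored in a short Python sketch with the standard library. It is an illustration under assumptions: the sample uses the numeric reference &#xA0; because an XML parser rejects the undeclared &nbsp; entity, the filtering is done in Python because ElementTree has no contains() function, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # the character that &nbsp; and &#xA0; both stand for

# Sample table from the question, with &nbsp; written as &#xA0;.
SAMPLE = """<table>
<tr class="even"><td>apple</td><td>pear</td><td>orange</td></tr>
<tr class="odd"><td>apple</td><td>pear</td><td>&#xA0;</td></tr>
<tr class="even"><td>apple</td><td>pear</td><td>orange</td></tr>
</table>"""


def has_nbsp(cell):
    """True when the cell's text content contains a no-break space."""
    return NBSP in "".join(cell.itertext())


def cells_without_nbsp(table):
    """Mirror of /table/tr/td[not(contains(., NBSP))]."""
    return [td for td in table.findall("tr/td") if not has_nbsp(td)]


def rows_without_nbsp(table):
    """Mirror of /table/tr[not(td[contains(., NBSP)])]."""
    return [
        tr for tr in table.findall("tr")
        if not any(has_nbsp(td) for td in tr.findall("td"))
    ]


if __name__ == "__main__":
    table = ET.fromstring(SAMPLE)
    print(len(cells_without_nbsp(table)), len(rows_without_nbsp(table)))
```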
