How to avoid double borders in HTML graphviz - graphviz

I have the following simple node in a graph:
digraph "graph.svg" {
graph [bgcolor="#333333" fontcolor=white fontname=Helvetica fontsize=16 label="Title" rankdir=TB]
0 [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="2" BGCOLOR="#006699">
<TR>
<TD COLSPAN="2">Node Titel</TD>
</TR>
<TR>
<TD COLSPAN="2">Sieve</TD>
</TR>
<TR>
<TD CELLPADDING="0">
<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" BGCOLOR="#006699">
<TR>
<TD BORDER="1">in 1</TD>
</TR>
<TR>
<TD BORDER="1">in 2</TD>
</TR>
</TABLE>
</TD>
<TD CELLPADDING="0">
<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" BGCOLOR="#006699">
<TR>
<TD BORDER="1">out 1</TD>
</TR>
<TR>
<TD BORDER="1">out 2</TD>
</TR>
<TR>
<TD BORDER="1">out 3</TD>
</TR>
</TABLE>
</TD>
</TR>
</TABLE>> shape=plaintext]
}
This renders with double borders around the nested port tables.
How can I make the borders align such that no double borders appear anywhere between the nested tables?
I managed to work around it by fiddling with CELLSPACING="-1", but I don't think that is the way to go.
I cannot use the COLSPAN option because the inputs and outputs ports are variable in size, that's why I solved this with a nested table for both input and output cells.

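You were nearly there. Set BORDER="0" on the two outer cells that hold the nested tables, so that only the inner cells draw borders: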
digraph "graph.svg" {
graph [bgcolor="#333333" fontcolor=white fontname=Helvetica fontsize=16 label="Title" rankdir=TB]
0 [label=<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="2" BGCOLOR="#006699">
<TR>
<TD COLSPAN="2">Node Titel</TD>
</TR>
<TR>
<TD COLSPAN="2">Sieve</TD>
</TR>
<TR>
<TD CELLPADDING="0" BORDER="0">
<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" BGCOLOR="#006699">
<TR>
<TD BORDER="1">in 1</TD>
</TR>
<TR>
<TD BORDER="1">in 2</TD>
</TR>
</TABLE>
</TD>
<TD CELLPADDING="0" BORDER="0">
<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" BGCOLOR="#006699">
<TR>
<TD BORDER="1">out 1</TD>
</TR>
<TR>
<TD BORDER="1">out 2</TD>
</TR>
<TR>
<TD BORDER="1">out 3</TD>
</TR>
</TABLE>
</TD>
</TR>
</TABLE>> shape=plaintext]
}
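Because the number of input and output ports varies, the label can also be generated rather than written by hand. The following is a minimal Python sketch, not part of the original answer; the function names `port_table` and `node_label` are made up for illustration, and it assumes at least one port on each side. It emits the same structure as above, with BORDER="0" on the two outer cells.

```python
from html import escape


def port_table(ports):
    """Build the nested one-column table that lists the given port names."""
    rows = "".join(f'<TR><TD BORDER="1">{escape(p)}</TD></TR>' for p in ports)
    return ('<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" '
            f'BGCOLOR="#006699">{rows}</TABLE>')


def node_label(title, kind, inputs, outputs):
    """Return a Graphviz HTML-like label for a node with variable ports."""
    # The outer cells that wrap the nested tables get BORDER="0", so the
    # only borders drawn around the ports come from the inner cells.
    def wrap(ports):
        return f'<TD CELLPADDING="0" BORDER="0">{port_table(ports)}</TD>'

    return (
        '<<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" '
        'CELLPADDING="2" BGCOLOR="#006699">'
        f'<TR><TD COLSPAN="2">{escape(title)}</TD></TR>'
        f'<TR><TD COLSPAN="2">{escape(kind)}</TD></TR>'
        f'<TR>{wrap(inputs)}{wrap(outputs)}</TR>'
        '</TABLE>>'
    )


label = node_label("Node Titel", "Sieve",
                   ["in 1", "in 2"], ["out 1", "out 2", "out 3"])
dot = f'digraph "graph.svg" {{\n0 [label={label} shape=plaintext]\n}}'
print(dot)
```

The printed text can be saved to a file and passed to `dot` as usual; only the two port lists change from node to node.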

Related

Graphviz draw tile with different cell colour and length

How to represent this in a dot file?
digraph structs {
node1 [shape=plaintext,
label = <<table border="0" cellspacing="0">
<tr>
<td width="20">0</td>
<td width="20">1</td>
<td width="20">2</td>
<td width="20">3</td>
<td width="20">4</td>
<td width="20">5</td>
<td width="20">6</td>
<td width="20">7</td>
<td width="20">8</td>
<td width="20">9</td>
<td width="20">10</td>
<td width="20">11</td>
<td width="20">12</td>
<td width="20">13</td>
<td width="20">14</td>
</tr>
<tr>
<td border="1" colspan="3" bgcolor="yellow">A</td>
<td border="1" colspan="1" bgcolor="white"></td>
<td border="1" colspan="1" bgcolor="white"></td>
<td border="1" colspan="1" bgcolor="white"></td>
<td border="1" colspan="2" bgcolor="pink">B</td>
<td border="1" colspan="1" bgcolor="white"></td>
<td border="1" colspan="2" bgcolor="green">C</td>
<td border="1" colspan="4" bgcolor="#40e0d0">D</td>
</tr>
</table>>
];
}

Why does Firefox incorrectly display a table with colspan and rowspan?

The table is displayed well in Chrome and Opera, and not at all in Firefox ... How can I fix it?
Code: https://jsfiddle.net/Hamsterman/6zth1mjv/
<table>
<tr>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
</tr>
<tr></tr>
<tr>
<td colspan="2" rowspan="2"></td>
<td colspan="3" rowspan="3"></td>
<td colspan="3" rowspan="3"></td>
<td colspan="2" rowspan="2"></td>
</tr>
<tr></tr>
<tr>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
</tr>
<tr>
<td colspan="3" rowspan="3"></td>
<td colspan="3" rowspan="3"></td>
</tr>
<tr>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
</tr>
<tr></tr>
<tr>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
<td colspan="2" rowspan="2"></td>
</tr>
<tr></tr>
</table>

Loop to scrape multiple elements on the same page while storing them separately

I wish to scrape multiple product names from a single page while using Scrapy
<!-- body_text //-->
<td width="601" valign="top">
<table border="0" width="100%" cellspacing="0" cellpadding="0">
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td class="pageHeading">Pool (Pocket Billiards) Table</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td class="main">A Victoria table is more than mere wood and slate. By paying attention to the details - the hidden differences - Victoria tables have become known name as masterpieces of original design and craftmanship, and most prestigious name in billiards.<br><br>
These tables, available in two sizes 9’ X 4.5’ and 8’ X 4’, are made of frames with selected good quality solid wood and finely crafted rose wood legs with Mahagony polish.<br><br>
Slate Beds used are either Indian Bangalore Black Slate or Imported Slate. Slates are covered with worsted wool cloth optionally from Jupiter (China) or Strachan (West of England cloth, U.K.) to have proper speed, accuracy and responsiveness of the table to spin. Chrome nuts and adjusters are used for leveling. It is surrounded with standard imported vulcanized 'L' shaped or 'V' shaped rubber cushions or Northern Cushions (Made in England) to cause billiard balls to rebound while minimizing the lose of kinetic energy.</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20B</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-20b.jpg" alt="VS-20B" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20C</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8‘ X 4‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-20c.jpg" alt="VS-20C" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23B</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-23b.jpg" alt="VS-23B" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23C</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8‘ X 4‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-23c.jpg" alt="VS-23C" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs9"></a>VS-9</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Auto Ball Return System</li><li>Pro Speed Cloth</li><li>American Pocket Size</li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-9.jpg" alt="VS-9" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs7"></a>VS-7</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98"L X 54" W X 31" H</strong></li><li>Solid oak for top/brand rails, Dark cherry finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket. Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-7.jpg" alt="VS-7" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs8"></a>VS-8/Light Oak</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, Light oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-8.jpg" alt="VS-8/Light Oak" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs12"></a>VS-12</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 99-3/4"L X 55 - 3/4" W X 31" H</strong></li><li>Black laminate, pedestal legs, with drop pocket, Steel frame Easy assembly. Accessories included.</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-12.jpg" alt="VS-12" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs10"></a>VS-10</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" L X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-10.jpg" alt="VS-10" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs11"></a>VS-11</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails</li><li>Mahogany finish</li><li>Rams head solid rubber with # 6 leather drop pocket</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-11.jpg" alt="VS-11" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs13"></a>VS-13</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails,</li><li>Dark cherry finish</li><li>Rams head solid rubber wood<br />
<br />
with # 6 leather drop pocket</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-13.jpg" alt="VS-13" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0">
<tr>
<td width="50%" valign="top" class="product_name1" colspan="2"><strong>Standard Accessories for Pool</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" class="product_box1">
<tr>
<td width="50%" valign="top" class="product_text">
<ul>
<li>Aramith Pool Ball 2.1/4" or 2.1/16"</li>
<li>Table Brush</li>
<li>60" Rest Stick C/W Brass Cross Head Rest</li>
<li>Wall Cue Rack</li>
</ul></td>
<td width="50%" valign="top" class="product_text">
<ul>
<li>Plastic Triangle</li>
<li>Triangle Chalk X 12 Pcs.</li>
<li>Pool House Cue X 4 Pcs.</li>
<li>Table Cover</li>
<li>Round Type Lamp Shade X 2 Pcs.</li>
</ul></td>
</tr>
</table>
</td>
</tr>
</table></td>
<!-- body_text_eof //-->
<td width="45" valign="top">
<table border="0" width="45" cellspacing="0" cellpadding="0">
<!-- right_navigation //-->
As you can see from the code, the fields I want to scrape are at the XPath: td[@class='product_name']/strong/a/@name
I also need to pull the images from this XPath: td[@align='center']/a/img/@src
I'm exporting my data to CSV, and currently my scraper stores all the product names in one cell. I'm trying to make it store each product name and image URL individually, each in its own cell of the CSV.
I tried using a loop for this but can't get it to work.
My Code:
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select("//head")
    items = []
    item = item()
    for i in range(0,5):
        item ["productname"] = titles.select("//td[@class='product_name'][i]/strong").extract()
        item ["imgurl"] = titles.select("//td[@align='center'][i]/a/img/@src").extract()
        items.append(item)
    return(items)
names = hxs.xpath('//td[@class="product_name"]/strong/text()').extract()
imageurls = hxs.xpath('//tr/td[@align="center"]/a/img/@src').extract()
for name, url in zip(names, imageurls):
    item = ProductItem()  # hypothetical name: use your own Item class; one new item per product
    item["productname"] = name
    item["imgurl"] = url
    yield item
This is the simplest way to do it, because the names and image URLs are extracted in the same document order and so correspond to each other.
You don't need to select the elements one by one (by changing the i index in a loop, as you did). The path expression below:
//td[@class='product_name']/strong/a/@name
already returns a node-set containing two items. You just have to loop over the elements that were returned to extract each attribute string.
As for the second expression:
//td[@align='center']/a/img/@src
there is only one match and you could extract the string directly.
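To make the pairing concrete without a running spider, here is a small self-contained sketch that uses Python's standard `xml.etree.ElementTree` in place of Scrapy selectors (an assumption made so that it runs anywhere). The markup is a trimmed, well-formed sample modelled on the page in the question, where the `img` sits directly inside the `td`; the column names `productname` and `imgurl` follow the question's field names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed sample modelled on the question's markup (two products only).
SAMPLE = """
<table>
  <tr><td class="product_name"><strong><a name="vs20b"></a>VS-20B</strong></td></tr>
  <tr><td align="center"><img src="images/products/vs-20b.jpg" alt="VS-20B"/></td></tr>
  <tr><td class="product_name"><strong><a name="vs20b"></a>VS-20C</strong></td></tr>
  <tr><td align="center"><img src="images/products/vs-20c.jpg" alt="VS-20C"/></td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)

# Each query returns every match in document order, so no index loop is needed.
# itertext() is used because the visible name follows the empty <a> anchor.
names = ["".join(strong.itertext()).strip()
         for strong in root.findall(".//td[@class='product_name']/strong")]
urls = [img.get("src")
        for img in root.findall(".//td[@align='center']/img")]

# zip pairs the n-th name with the n-th image URL: one row per product.
rows = list(zip(names, urls))

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["productname", "imgurl"])
writer.writerows(rows)
print(buf.getvalue())
```

Note that `zip` stops at the shorter list, so if some product had no image the pairs would drift out of step; in that case it is safer to loop over each product's container and query inside it.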

How to create a two column email newsletter

I am trying to create a two column email flyer but I'm having trouble with the coding as Outlook hates CSS.
I'm using tables to keep it as simple as possible but I want two separate tables on the left and the right so I can add data into it as I wish.
I tried using float left and right on the two tables but Outlook ignores this style.
I know the two grey tables at the bottom are each in their own separate "holder" tables but this is so I can duplicate the grey "data" tables for when I add new articles.
<table class="all" width="auto" height="auto" border="0" cellspacing="0"><tr><td height="504">
<table width="750" height="140" border="0" cellspacing="0">
<tr>
<td width="200" valign="bottom" bgcolor="#E6E6E6"> </td>
<td width="345" align="center" valign="bottom" bgcolor="#E6E6E6"> </td>
<td width="152" align="center" valign="bottom" bgcolor="#E6E6E6"> </td>
<td width="45" align="center" valign="bottom" bgcolor="#E6E6E6"> </td>
</tr>
<tr>
<td width="200" valign="bottom" bgcolor="#E6E6E6"> </td>
<td align="center" valign="bottom" bgcolor="#E6E6E6"><font color="#111111" face="Arial Narrow" size="+2">DECEMBER NEWSLETTER</font></td>
<td width="152" align="center" valign="bottom" bgcolor="#E6E6E6"><font size="2"><strong>#4 - <span class="orange">04.12.13</span></strong></font></td>
<td width="45" align="center" valign="bottom" bgcolor="#E6E6E6"> </td>
</tr>
</table>
<table width="750" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="75" height="50" bgcolor="#E6E6E6" scope="row"> </td>
<td width="600" rowspan="2" scope="row"><img src="http://placehold.it/600x200"/></td>
<td width="75" bgcolor="#E6E6E6" scope="row"> </td>
</tr>
<tr>
<td width="75" height="81" scope="row"> </td>
<td scope="row"> </td>
</tr>
</table>
<table class="holder" width="750" border="0" cellspacing="0" cellpadding="0">
<tr>
<td valign="top" scope="row">
<table class="inlinetableleft" width="360">
<tr>
<td width="371" align="left">
<!------------LEFT COLUMN------------------>
<table width="360" border="0" cellspacing="0" cellpadding="0">
<tr>
<th height="103" colspan="4" align="left" valign="middle" bgcolor="#CCCCCC" scope="row"> </th>
</tr>
</table>
<!--------------LEFT COLUMN END------------->
</td>
</tr>
</table>
<table class="inlinetableright" width="360">
<tr>
<td align="left">
<!------------RIGHT COLUMN------------------>
<table width="360" border="0" cellspacing="0" cellpadding="0">
<tr>
<td height="106" align="left" bgcolor="#CCCCCC" scope="row"> </td>
</tr>
</table>
<!-----------RIGHT COLUMN END-------------->
</td></tr>
</table>
</td>
</tr>
</table>
Here is a fiddle of my newsletter so far; it's the bottom two grey tables that I want to be side by side.
For HTML emails, nested tables are your friend :)
Note: the border around the table is just to show you where the tables are.
<table border="0" width="600" cellpadding="0" cellspacing="0" align="center">
<tr>
<td colspan="2">
header content here
</td>
</tr>
<tr>
<td width="300">
<table border="0" width="300" cellpadding="1" cellspacing="0" align="left">
<tr>
<td>Left Content</td>
</tr>
</table>
</td>
<td width="300">
<table border="0" width="300" cellpadding="1" cellspacing="0" align="left">
<tr>
<td>Right content</td>
</tr>
</table>
</td>
</tr>
</table>
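Since the asker wants to duplicate the column tables for each new article, the rows can also be generated. This is a rough Python sketch, not part of the original answer; the function names `column`, `two_column_row` and `newsletter` are hypothetical, and the 600-pixel width is simply taken from the example above.

```python
from html import escape


def column(content, width):
    """One outer cell holding a nested table, as in the markup above."""
    return (f'<td width="{width}">'
            f'<table border="0" width="{width}" cellpadding="1" '
            f'cellspacing="0" align="left">'
            f'<tr><td>{escape(content)}</td></tr></table></td>')


def two_column_row(left, right, total_width=600):
    """A single row with a left and a right column of equal width."""
    half = total_width // 2
    return f'<tr>{column(left, half)}{column(right, half)}</tr>'


def newsletter(header, articles, total_width=600):
    """Header row followed by one two-column row per (left, right) pair."""
    rows = "".join(two_column_row(left, right, total_width)
                   for left, right in articles)
    return (f'<table border="0" width="{total_width}" cellpadding="0" '
            f'cellspacing="0" align="center">'
            f'<tr><td colspan="2">{escape(header)}</td></tr>{rows}</table>')


markup = newsletter("header content here", [("Left Content", "Right content")])
print(markup)
```

Adding another pair to the `articles` list adds another side-by-side row without touching the layout code.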

Trying to find XPath for multiple TDs

I want to extract the Address for specific Numbers (the first TD) of this table. The only unique identifier for the table is the H3.
Here is the code for the table:
<table width="95%" cellpadding=5 cellspacing=0 border=1>
<tr><td colspan="4"><h3>The list</td></tr>
<tr>
<td>Number</td><td>First Name</td>
<td>Last Name</td><td>Address</td>
</tr>
I have tried:
//table[@h3=’See this now’]/’tr/td[87] and td[107] and td[116]
I am new to xpath, and programming in general. It's pretty fun, but would love to be able to figure this one out!! Appreciate any help :D
First, your HTML is wrong.
You did not close your Table element.
You did not close your H3 element.
You must enclose your attributes in quotes.
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h3>The list</h3>
</td>
</tr>
<tr>
<td>Number</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
Once you have fixed the formatting of your XHTML, you can traverse the document tree.
XPATH
Any table, with any td that has a h3.
//table//td/h3
Will return
<h3>The list</h3>
For the number
//table//tr[2]/td[1] <-- any table, the second tr element in this table, the first td in that second tr
Will return
<td>Number</td>
If a document contains multiple tables and you want the matching element from each of them, this is just as simple. Say we have an XHTML document with many tables inside a parent element, for example a 'root' element.
<root>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h3>The list</h3>
</td>
</tr>
<tr>
<td>123</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h3>The list</h3>
</td>
</tr>
<tr>
<td>456</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h3>The list</h3>
</td>
</tr>
<tr>
<td>789</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
</root>
We can extract the first td of the second row of every table using the following XPath expression:
//table/tr[2]/td[1]
This will give us the result of
<td>123</td>
-----------------------
<td>456</td>
-----------------------
<td>789</td>
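The same expression can be tried out in a few lines of Python. This sketch uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath (the leading `.` makes the path relative to the root element); the document is a shortened version of the one above.

```python
import xml.etree.ElementTree as ET

# Shortened version of the three-table document above.
DOC = """
<root>
  <table>
    <tr><td colspan="4"><h3>The list</h3></td></tr>
    <tr><td>123</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
  <table>
    <tr><td colspan="4"><h3>The list</h3></td></tr>
    <tr><td>456</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
  <table>
    <tr><td colspan="4"><h3>The list</h3></td></tr>
    <tr><td>789</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
</root>
"""

root = ET.fromstring(DOC)

# For every table: the second tr child, then that row's first td child.
cells = root.findall(".//table/tr[2]/td[1]")
numbers = [td.text for td in cells]
print(numbers)
```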
Now say we have several tables, but only one of them matters to us: the table that contains an H3 element. If a table has an H3 element, we want to extract the first td of its second row.
<root>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h4>Ignore me!</h4>
</td>
</tr>
<tr>
<td>1164961564896</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h1>I'm not interesting</h1>
</td>
</tr>
<tr>
<td>456456466465</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
<table width="95%" cellpadding="5" cellspacing="0" border="1">
<tr>
<td colspan="4">
<h3>IM THE IMPORTANT TABLE!</h3>
</td>
</tr>
<tr>
<td>123456789</td>
<td>First Name</td>
<td>Last Name</td>
<td>Address</td>
</tr>
</table>
</root>
We can accomplish this by finding the H3 element, traversing back up the tree to its table, and then selecting the first td of each tr in that table:
//table//h3/../../../tr/td[1]
Will return
<td colspan="4">
<h3>IM THE IMPORTANT TABLE!</h3>
</td>
-----------------------
<td>123456789</td>
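If only the second row's first cell is wanted, and not the heading cell as well, the check can be done per table. A minimal sketch with Python's `xml.etree.ElementTree`: to my knowledge its XPath subset only accepts direct-child tags in predicates, so the "has an H3 somewhere inside" filter is written in Python here. The document is a shortened version of the one above.

```python
import xml.etree.ElementTree as ET

# Shortened version of the document above: only the last table has an h3.
DOC = """
<root>
  <table>
    <tr><td colspan="4"><h4>Ignore me!</h4></td></tr>
    <tr><td>1164961564896</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
  <table>
    <tr><td colspan="4"><h1>I'm not interesting</h1></td></tr>
    <tr><td>456456466465</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
  <table>
    <tr><td colspan="4"><h3>IM THE IMPORTANT TABLE!</h3></td></tr>
    <tr><td>123456789</td><td>First Name</td><td>Last Name</td><td>Address</td></tr>
  </table>
</root>
"""

root = ET.fromstring(DOC)

result = []
for table in root.iter("table"):
    # Skip tables that do not contain an h3 anywhere below them.
    if table.find(".//h3") is None:
        continue
    # Second row, first cell of the table that passed the check.
    cell = table.find("tr[2]/td[1]")
    if cell is not None:
        result.append(cell.text)

print(result)
```

With a full XPath 1.0 engine the same filter can be written as a single expression, `//table[.//h3]/tr[2]/td[1]`.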

Resources