Indentation graphviz dot in case of HTML like labels - graphviz

I have a generated UML diagram that I render with dot (tested with version 7.0.5). The results are OK, but when I want to split lines in the graph, the rendering is not what I would like.
I tried to get the result with HTML-like labels in dot, but this looks worse (I know that the dot manual, https://graphviz.org/doc/info/shapes.html#html, states that they have only limited possibilities).
The original (node2) and adjusted / test (node3) code:
digraph "Class" {
bgcolor="transparent";
edge [fontname=Helvetica,fontsize=10,labelfontname=Helvetica,labelfontsize=10];
node [fontname=Helvetica,fontsize=10,shape=record,height=0.2,width=0.8];
Node2 [label="{FClass\n|# m_trackingEnabled\l|+ FClass()\l+ InsertRenderMaterial\lInsertRenderMaterial()\l+ InsertRenderMaterialDelete\lInsertMaterial()\l+ setTrackingEnabled()\l}",
height=0.2,width=0.4,color="grey75", fillcolor="white", style="filled"];
Node3 [label=<
<TABLE CELLBORDER="0" BORDER="0">
<TR><TD COLSPAN="2" CELLPADDING="1" CELLSPACING="0">FClass</TD></TR>
<HR/>
<TR><TD VALIGN="top" CELLPADDING="1" CELLSPACING="0">#</TD><TD VALIGN="top" ALIGN="left" CELLPADDING="1" CELLSPACING="0">m_trackingEnabled</TD></TR>
<HR/>
<TR><TD VALIGN="top" CELLPADDING="0" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLPADDING="0" CELLSPACING="0">FClass()</TD></TR>
<TR><TD VALIGN="top" CELLPADDING="0" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="LEFT" CELLPADDING="0" CELLSPACING="0">InsertRenderMaterial<BR ALIGN="LEFT"/>InsertRenderMaterial()</TD></TR>
<TR><TD VALIGN="top" CELLPADDING="0" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLPADDING="0" CELLSPACING="0">InsertRenderMaterialDelete<BR ALIGN="LEFT"/>InsertMaterial()</TD></TR>
<TR><TD VALIGN="top" CELLPADDING="0" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLPADDING="0" CELLSPACING="0">setTrackingEnabled()</TD></TR>
</TABLE>
>,height=0.2,width=0.4,color="grey75", fillcolor="white", style="filled"];
}
the resulting image (left original, right adjusted / test):
Problems:
original
when a line is split (the second and third "Insert"), the continuation starts at the left edge. This is logical, as we have just one string, but far from nice in our case.
adjusted
the indentation of split lines depends on their length
(minor) the horizontal rule should stretch across the full box
Note:
I used here Helvetica as font, but the user can choose any font he likes so just adding some spaces in the original doesn't give the right result (unless....)
the initial character of the description will be one of +, #, -, *, ~.
Any solution / suggestions for improvement?

Small changes to both the record and HTML versions:
to record:
added 3 non-breaking spaces (&nbsp;) after each \l line break to shift the continuation right (align)
to html:
added <BR ALIGN="LEFT"/> to the end of every line segment (it seems to apply to the preceding text)
set the html node shape to plain so that HR extends to the edges
set cellpadding="1" (just because)
digraph "Class" {
bgcolor="transparent";
edge [fontname=Helvetica,fontsize=10,labelfontname=Helvetica,labelfontsize=10];
node [fontname=Helvetica,fontsize=10,shape=record,height=0.2,width=0.8];
//
// added 3 non-breaking spaces (&nbsp;) after \l line break to shift right (align)
//
Node2 [label="{FClass\n|# m_trackingEnabled\l|+ FClass()\l+ InsertRenderMaterial\l&nbsp;&nbsp;&nbsp;InsertRenderMaterial()\l+ InsertRenderMaterialDelete\l&nbsp;&nbsp;&nbsp;InsertMaterial()\l+ setTrackingEnabled()\l}",
height=0.2,width=0.4,color="grey75", fillcolor="white", style="filled"];
//
// added <BR ALIGN="LEFT"/> to the end of every line segment (seems to apply to preceding text)
// set html node shape to plain to extend HR to edges
// set cellpadding="1" (just because)
//
Node3 [shape=plain label=<
<TABLE CELLBORDER="0" BORDER="0" CELLPADDING="1" >
<TR><TD COLSPAN="2" CELLPADDING="1" CELLSPACING="0">FClass</TD></TR>
<HR/>
<TR><TD VALIGN="top" CELLPADDING="1" CELLSPACING="0">#</TD><TD VALIGN="top" ALIGN="left" CELLPADDING="1" CELLSPACING="0">m_trackingEnabled</TD></TR>
<HR/>
<TR><TD VALIGN="top" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLSPACING="0">FClass()</TD></TR>
<TR><TD VALIGN="top" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="LEFT" CELLSPACING="0">InsertRenderMaterial<BR ALIGN="LEFT"/>InsertRenderMaterial()<BR ALIGN="LEFT"/></TD></TR>
<TR><TD VALIGN="top" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLSPACING="0">InsertRenderMaterialDelete<BR ALIGN="LEFT"/>InsertMaterial()<BR ALIGN="LEFT"/></TD></TR>
<TR><TD VALIGN="top" CELLSPACING="0">+</TD><TD VALIGN="top" ALIGN="left" CELLSPACING="0">setTrackingEnabled()</TD></TR>
</TABLE>
>,height=0.2,width=0.4,color="grey75", fillcolor="white", style="filled"];
}
Giving:
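Since the diagram is generated anyway, the HTML rows can be produced by a small helper. The following is only a minimal Python sketch, not part of the answer above: the function name member_row and its arguments are made up for illustration. It builds one table row per class member and ends every display line with <BR ALIGN="LEFT"/>, as in Node3.

```python
from html import escape


def member_row(visibility, segments):
    """Return one <TR> for a class member (hypothetical helper).

    visibility: one of '+', '#', '-', '*', '~'
    segments:   the member text, already split into display lines
    """
    # Ending every segment with <BR ALIGN="LEFT"/> left-aligns all lines,
    # so continuation lines start under the first one.
    text = "".join(escape(s) + '<BR ALIGN="LEFT"/>' for s in segments)
    return (
        f'<TR><TD VALIGN="top">{escape(visibility)}</TD>'
        f'<TD VALIGN="top" ALIGN="left">{text}</TD></TR>'
    )


row = member_row("+", ["InsertRenderMaterial", "InsertRenderMaterial()"])
print(row)
```

Because the visibility marker lives in its own cell, the indentation of the continuation lines no longer depends on the font the user picks.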

Related

How can I create a table like this with Graphviz (Table with arrow connecting rows)

I need to make a table with arrows pointing from one row to another.
On the left hand side, the arrows are pointing from the top to bottom and vice versa on the right hand side.
This table will be used for documentation and my plan is to generate the .dot file programmatically using python. The original table in the diagram below was generated using Latex, but the server which will be running this flow does not have any pdf-latex converters installed.
I tried the sample code below in Graphviz, but it's not giving the desired output.
digraph {
graph [pad="0.5", nodesep="0.5", ranksep="2", splines="ortho"];
node [shape=plain]
rankdir=LR;
Foo [label=<
<table border="0" cellborder="1" cellspacing="0">
<tr> <td><i>Input Foo</i></td> <td> two </td> </tr>
<tr> <td port="1">one</td> <td> two </td></tr>
<tr> <td port="2">two</td> <td> two </td></tr>
<tr> <td port="3">three</td> <td> two </td></tr>
<tr> <td port="4">four</td> <td> two </td></tr>
<tr> <td port="5">five</td> <td> two </td></tr>
<tr> <td port="6">six</td> <td> two </td></tr>
</table>>];
Foo:3 -> Foo:2;
Foo:3 -> Foo:6;
Foo:6 -> Foo:1;
}
Congratulations, you found a bug in the "ortho" implementation.
If you remove the splines attribute, or change the value, you will get edges to-from the correct ports.
If you then add directional ports (n, s, e, w, ...) the edge connections make sense.
But the edges will be loops, not the squared-off edges you desire. (see below).
It might be possible to "roll-your-own" edges using straight lines and two invisible points for each edge. But that would be a double hassle since a table is involved.
You could also output to the "dot" format, post-process that to create squared-off edges, and then run that through neato -n2, but again some real hassle.
digraph {
graph [pad="0.5", nodesep="0.5", ranksep="2" ] // splines=ortho]
node [shape=plain]
// rankdir=LR; // makes a very small difference
Foo [label=<
<table border="0" cellborder="1" cellspacing="0">
<tr> <td><i>Input Foo</i></td><td> two </td> </tr>
<tr> <td port="1">one</td><td> two </td></tr>
<tr> <td port="2">two</td><td> two </td></tr>
<tr> <td port="3">three</td><td> two </td></tr>
<tr> <td port="4">four</td><td> two </td></tr>
<tr> <td port="5">five</td><td> two </td></tr>
<tr> <td port="6">six</td><td> two </td></tr>
</table>>];
Foo:3:w -> Foo:2:w;
Foo:3:w -> Foo:6:w;
Foo:6:w -> Foo:1:w;
}
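As the question mentions generating the .dot file from Python, here is a rough sketch of how the workaround graph above could be emitted. The function table_dot and its parameters are hypothetical, and the cell texts are assumed to contain no characters that need XML escaping.

```python
def table_dot(header, rows, edges, side="w"):
    """Build DOT text for a one-node table with row-to-row edges.

    header: (left, right) header cell texts
    rows:   list of (left, right) cell texts; row i gets port str(i + 1)
    edges:  list of (from_port, to_port) integer pairs
    side:   compass point used on both ends of every edge
    """
    lines = [
        "digraph {",
        '  graph [pad="0.5", nodesep="0.5", ranksep="2"]',
        "  node [shape=plain]",
        "  Foo [label=<",
        '    <table border="0" cellborder="1" cellspacing="0">',
        f"    <tr><td><i>{header[0]}</i></td><td>{header[1]}</td></tr>",
    ]
    for i, (left, right) in enumerate(rows, start=1):
        lines.append(f'    <tr><td port="{i}">{left}</td><td>{right}</td></tr>')
    lines.append("    </table>>];")
    # Each edge names the port and a compass point, e.g. Foo:3:w -> Foo:2:w
    for src, dst in edges:
        lines.append(f"  Foo:{src}:{side} -> Foo:{dst}:{side};")
    lines.append("}")
    return "\n".join(lines)


names = ["one", "two", "three", "four", "five", "six"]
dot = table_dot(("Input Foo", "two"), [(n, "two") for n in names],
                [(3, 2), (3, 6), (6, 1)])
print(dot)
```

The splines attribute is deliberately left out, matching the workaround described above.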

Graphs and subgraphs layout

Using graphviz I want to generate the following graph (manually written in tikz for now):
I currently succeeded in writing a dot file giving me the following result (there is a single node using html for the Pile table on the top, and one node also using html for every grey box):
Unfortunately, I was not able to keep the layout of the grey boxes that I like AND have the grey boxes to the right of the Pile table. I carefully read this related question and tried things with rank, subgraphs and rankdir, without success.
Is there any hope of reaching my goal with dot?
EDIT: here is the complete dot file I have currently
digraph structs {
node [shape=plaintext]
subgraph stack {
label = STACK;
stack [label=<
<TABLE BORDER="1" CELLBORDER="0" CELLSPACING="0" BGCOLOR="white">
<TR><TD><i>Nom</i></TD><TD><i>Type</i></TD><TD><i>Portée</i></TD><TD><i>Valeur</i></TD></TR>
<TR><TD BGCOLOR="chartreuse">list2</TD><TD BGCOLOR="chartreuse">référence</TD><TD BGCOLOR="chartreuse">main</TD><TD PORT="port_140407657518560" BGCOLOR="chartreuse">0x7fb334a23ac0</TD></TR>
<TR><TD BGCOLOR="chartreuse">list1</TD><TD BGCOLOR="chartreuse">référence</TD><TD BGCOLOR="chartreuse">main</TD><TD PORT="port_140407657518032" BGCOLOR="chartreuse">0x7fb334a23ac0</TD></TR>
</TABLE>
>
];
}
subgraph heap {
label = HEAP;
struct_140407658920640 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD COLSPAN="6">0x7fb334a23ac0</TD></TR>
<TR><TD COLSPAN="6"><u>list</u></TD></TR>
<TR><TD PORT="port_child0">0x955e80</TD>
<TD PORT="port_child1">0x956360</TD>
<TD PORT="port_child2">0x956040</TD>
<TD PORT="port_child3">0x7fb3348cbbb0</TD>
<TD PORT="port_child4">0x7fb3348cbc30</TD>
<TD PORT="port_child5">0x7fb3348cbc70</TD>
</TR>
</TABLE>>];
struct_9789056 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x955e80</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>3</TD></TR>
</TABLE>>];
struct_140407658920640:port_child0 -> struct_9789056;
struct_9790304 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x956360</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>42</TD></TR>
</TABLE>>];
struct_140407658920640:port_child1 -> struct_9790304;
struct_9789504 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x956040</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>17</TD></TR>
</TABLE>>];
struct_140407658920640:port_child2 -> struct_9789504;
struct_140407657511856 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbbb0</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"go"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child3 -> struct_140407657511856;
struct_140407657511984 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbc30</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"feu"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child4 -> struct_140407657511984;
struct_140407657512048 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbc70</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"partez"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child5 -> struct_140407657512048;
}
stack:port_140407657518560 -> struct_140407658920640;
stack:port_140407657518032 -> struct_140407658920640;
}
Minor changes
changed subgraphs to clusters (see https://www.graphviz.org/pdf/dotguide.pdf)
added invisible edge to drive cluster positioning
digraph structs {
node [shape=plaintext]
subgraph cluster_stack { // changed to cluster
graph [peripheries=0] // no box around this cluster
label = STACK;
stack [label=<
<TABLE BORDER="1" CELLBORDER="0" CELLSPACING="0" BGCOLOR="white">
<TR><TD><i>Nom</i></TD><TD><i>Type</i></TD><TD><i>Portée</i></TD><TD><i>Valeur</i></TD></TR>
<TR><TD BGCOLOR="chartreuse">list2</TD><TD BGCOLOR="chartreuse">référence</TD><TD BGCOLOR="chartreuse">main</TD><TD PORT="port_140407657518560" BGCOLOR="chartreuse">0x7fb334a23ac0</TD></TR>
<TR><TD BGCOLOR="chartreuse">list1</TD><TD BGCOLOR="chartreuse">référence</TD><TD BGCOLOR="chartreuse">main</TD><TD PORT="port_140407657518032" BGCOLOR="chartreuse">0x7fb334a23ac0</TD></TR>
</TABLE>
>
];
}
subgraph cluster_heap { // changed to cluster
label = HEAP;
struct_140407658920640 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD COLSPAN="6">0x7fb334a23ac0</TD></TR>
<TR><TD COLSPAN="6" port="myport"><u>list</u></TD></TR>
<TR><TD PORT="port_child0">0x955e80</TD>
<TD PORT="port_child1">0x956360</TD>
<TD PORT="port_child2">0x956040</TD>
<TD PORT="port_child3">0x7fb3348cbbb0</TD>
<TD PORT="port_child4">0x7fb3348cbc30</TD>
<TD PORT="port_child5">0x7fb3348cbc70</TD>
</TR>
</TABLE>>];
struct_9789056 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x955e80</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>3</TD></TR>
</TABLE>>];
struct_140407658920640:port_child0 -> struct_9789056;
struct_9790304 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x956360</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>42</TD></TR>
</TABLE>>];
struct_140407658920640:port_child1 -> struct_9790304;
struct_9789504 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x956040</TD></TR>
<TR><TD><u>int</u></TD></TR>
<TR><TD>17</TD></TR>
</TABLE>>];
struct_140407658920640:port_child2 -> struct_9789504;
struct_140407657511856 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbbb0</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"go"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child3 -> struct_140407657511856;
struct_140407657511984 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbc30</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"feu"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child4 -> struct_140407657511984;
struct_140407657512048 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">
<TR><TD>0x7fb3348cbc70</TD></TR>
<TR><TD><u>str</u></TD></TR>
<TR><TD>"partez"</TD></TR>
</TABLE>>];
struct_140407658920640:port_child5 -> struct_140407657512048;
}
edge [constraint=false] // do not use next two edges to position nodes/clusters
stack:port_140407657518560 -> struct_140407658920640:myport
stack:port_140407657518032 -> struct_140407658920640
// add invisible edge & use to position nodes/clusters
edge [constraint=true style=invis]
stack:port_140407657518032 -> struct_9789056
}
Giving:
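The node names in the dot file look machine-generated, so the cluster wrapping can be folded into the generator. This is only a sketch under that assumption; heap_box and cluster are hypothetical helper names, not anything from the original generator.

```python
from html import escape


def heap_box(node_id, address, type_name, value):
    """DOT statement for one gray heap box (hypothetical helper)."""
    text = escape(str(value), quote=False)  # keep quotes, escape < > &
    return (
        f"{node_id} [label=<\n"
        '<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" BGCOLOR="gray">\n'
        f"<TR><TD>{address}</TD></TR>\n"
        f"<TR><TD><u>{type_name}</u></TD></TR>\n"
        f"<TR><TD>{text}</TD></TR>\n"
        "</TABLE>>];"
    )


def cluster(name, label, statements, boxed=True):
    """Wrap statements in a subgraph whose name starts with 'cluster_'."""
    body = [f"subgraph cluster_{name} {{"]
    if not boxed:
        body.append("graph [peripheries=0]")  # no box around this cluster
    body.append(f"label = {label};")
    body.extend(statements)
    body.append("}")
    return "\n".join(body)


box = heap_box("struct_9789056", "0x955e80", "int", 3)
heap = cluster("heap", "HEAP", [box])
print(heap)
```

The invisible positioning edge and the constraint=false edges would still be written at the top level of the graph, after both clusters.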

add labels to left and right of nodes

I have this input:
graph {
"1" -- "11"
"11" -- "111"
"1" -- "12"
"12" -- "121"
"12" -- "122"
}
Which will produce this graph
Is it possible to add labels to the left and right side of nodes so the output will be something like this?
No need to use strictly graphviz/DOT
It is possible with a little graphviz "hack" using labels of edges from and to the same node, but hiding the edge (penwidth=0):
graph {
edge [penwidth=0];
"1":w -- "1":w [taillabel="1"]
"1":e -- "1":e [taillabel="12"]
"11":n -- "11":w [taillabel="21"] // north - west
"11":e -- "11":e [taillabel="21"]
"12":w -- "12":w [taillabel="1"]
"12":e -- "12":e [taillabel="6"]
"111":w -- "111":w [taillabel="1"]
"111":e -- "111":e [taillabel="6"]
"121":w -- "121":w [taillabel="1"]
"121":e -- "121":e [taillabel="6"]
"122":w -- "122":w [taillabel="1"]
"122":e -- "122":e [taillabel="6"]
edge [penwidth=1];
"1" -- "11"
"11" -- "111"
"1" -- "12"
"12" -- "121"
"12" -- "122"
}
If you are flexible as to node shape, here is a solution using html-like nodes https://www.graphviz.org/doc/info/shapes.html#html. A little verbose, but easy to produce programmatically.
graph {
graph [splines=false]
node [shape=plaintext]
N1 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">1</TD><TD PORT="p1">1</TD><TD BORDER="0">12</TD>
</TR>
</TABLE>>];
N2 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">21</TD><TD PORT="p2">11</TD><TD BORDER="0">21</TD>
</TR>
</TABLE>>];
N3 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">1</TD><TD PORT="p3">12</TD><TD BORDER="0">6</TD>
</TR>
</TABLE>>];
N4 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">1</TD><TD PORT="p4">111</TD><TD BORDER="0">6</TD>
</TR>
</TABLE>>];
N5 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">1</TD><TD PORT="p5">121</TD><TD BORDER="0">6</TD>
</TR>
</TABLE>>];
N6 [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR>
<TD BORDER="0">1</TD><TD PORT="p6">122</TD><TD BORDER="0">6</TD>
</TR>
</TABLE>>];
N1:p1--N2:p2
N1:p1--N3:p3
N2:p2--N4:p4
N3:p3--N5:p5
N3:p3--N6:p6
}
Giving:
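To illustrate the "easy to produce programmatically" remark, here is a minimal Python sketch that emits the same kind of three-cell nodes. The helper name side_label_node and the data layout are assumptions for illustration; every node reuses the port name p, which is fine because ports are scoped to their node.

```python
def side_label_node(node_id, left, center, right):
    """DOT statement for a boxed center cell with borderless side labels."""
    return (
        f"{node_id} [label=<\n"
        '<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">\n'
        "<TR>\n"
        f'<TD BORDER="0">{left}</TD>'
        f'<TD PORT="p">{center}</TD>'
        f'<TD BORDER="0">{right}</TD>\n'
        "</TR>\n"
        "</TABLE>>];"
    )


# (left label, node text, right label), keyed by a DOT-safe node id
nodes = {
    "N1": ("1", "1", "12"),
    "N2": ("21", "11", "21"),
    "N3": ("1", "12", "6"),
}
edges = [("N1", "N2"), ("N1", "N3")]

lines = ["graph {", "graph [splines=false]", "node [shape=plaintext]"]
lines += [side_label_node(n, *cells) for n, cells in nodes.items()]
# Attach edges to the center cell so they do not start at the side labels
lines += [f"{a}:p -- {b}:p" for a, b in edges]
lines.append("}")
dot = "\n".join(lines)
print(dot)
```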

Loop to scrape multiple elements on the same page while storing them separately

I wish to scrape multiple product names from a single page while using Scrapy
<!-- body_text //-->
<td width="601" valign="top">
<table border="0" width="100%" cellspacing="0" cellpadding="0">
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td class="pageHeading">Pool (Pocket Billiards) Table</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td class="main">A Victoria table is more than mere wood and slate. By paying attention to the details - the hidden differences - Victoria tables have become known name as masterpieces of original design and craftmanship, and most prestigious name in billiards.<br><br>
These tables, available in two sizes 9’ X 4.5’ and 8’ X 4’, are made of frames with selected good quality solid wood and finely crafted rose wood legs with Mahagony polish.<br><br>
Slate Beds used are either Indian Bangalore Black Slate or Imported Slate. Slates are covered with worsted wool cloth optionally from Jupiter (China) or Strachan (West of England cloth, U.K.) to have proper speed, accuracy and responsiveness of the table to spin. Chrome nuts and adjusters are used for leveling. It is surrounded with standard imported vulcanized 'L' shaped or 'V' shaped rubber cushions or Northern Cushions (Made in England) to cause billiard balls to rebound while minimizing the lose of kinetic energy.</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20B</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-20b.jpg" alt="VS-20B" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs20b"></a>VS-20C</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8‘ X 4‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.B. Frame</li><li><strong>Bangalore Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-20c.jpg" alt="VS-20C" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23B</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-23b.jpg" alt="VS-23B" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs23b"></a>VS-23C</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 8‘ X 4‘</strong></li><li>Rose Wood Legs</li><li>Mahgony Polish</li><li>S.A.L. Frame</li><li><strong>Imported Slate</strong></li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-23c.jpg" alt="VS-23C" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs9"></a>VS-9</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>Size: 9‘ X 4.5‘</strong></li><li>Auto Ball Return System</li><li>Pro Speed Cloth</li><li>American Pocket Size</li><li>Standard Accessories</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-9.jpg" alt="VS-9" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs7"></a>VS-7</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98"L X 54" W X 31" H</strong></li><li>Solid oak for top/brand rails, Dark cherry finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket. Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-7.jpg" alt="VS-7" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs8"></a>VS-8/Light Oak</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, Light oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-8.jpg" alt="VS-8/Light Oak" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs12"></a>VS-12</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 99-3/4"L X 55 - 3/4" W X 31" H</strong></li><li>Black laminate, pedestal legs, with drop pocket, Steel frame Easy assembly. Accessories included.</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-12.jpg" alt="VS-12" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs10"></a>VS-10</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 98" L X 54"W X 31"H</strong></li><li>Solid oak for top/brand rails, oak finish</li><li>Rams head solid rubber wood with # 6 leather drop pocket, Easy assembly</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-10.jpg" alt="VS-10" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs11"></a>VS-11</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails</li><li>Mahogany finish</li><li>Rams head solid rubber with # 6 leather drop pocket</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-11.jpg" alt="VS-11" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0" class="product_box">
<tr>
<td width="50%" valign="top" class="product_name" colspan="2"><strong><a name="vs13"></a>VS-13</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" >
<tr>
<td width="60%" valign="top" class="product_text"><ul><li><strong>POOL TABLE : 8‘ X 4‘</strong></li><li><strong>PLAYING AREA : 88" X 44"</strong></li><li><strong>OVERALL SIZE : 100" X 56"</strong></li><li>Solid wood for top/brand rails,</li><li>Dark cherry finish</li><li>Rams head solid rubber wood<br />
<br />
with # 6 leather drop pocket</li></ul></td>
<td width="40%" align="center"><img src="images/products/vs-13.jpg" alt="VS-13" border="0" width="250px"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><img src="images/pixel_trans.gif" border="0" alt="" width="100%" height="10"></td>
</tr>
<tr>
<td>
<table cellpadding="4" cellspacing="0" width="100%" border="0">
<tr>
<td width="50%" valign="top" class="product_name1" colspan="2"><strong>Standard Accessories for Pool</strong></td>
</tr>
</table>
<table cellpadding="4" cellspacing="4" width="100%" border="0" class="product_box1">
<tr>
<td width="50%" valign="top" class="product_text">
<ul>
<li>Aramith Pool Ball 2.1/4" or 2.1/16"</li>
<li>Table Brush</li>
<li>60" Rest Stick C/W Brass Cross Head Rest</li>
<li>Wall Cue Rack</li>
</ul></td>
<td width="50%" valign="top" class="product_text">
<ul>
<li>Plastic Triangle</li>
<li>Triangle Chalk X 12 Pcs.</li>
<li>Pool House Cue X 4 Pcs.</li>
<li>Table Cover</li>
<li>Round Type Lamp Shade X 2 Pcs.</li>
</ul></td>
</tr>
</table>
</td>
</tr>
</table></td>
<!-- body_text_eof //-->
<td width="45" valign="top">
<table border="0" width="45" cellspacing="0" cellpadding="0">
<!-- right_navigation //-->
As you can see from the code, there are fields which I want to scrape, at the XPath: td[@class='product_name']/strong/a/@name
I also need to pull the images from this XPath: td[@align='center']/a/img/@src
I'm exporting my data to CSV, and currently my scraper stores all the product names in one cell. I'm trying to make it store each product name and its image URL individually, one per cell, in my CSV.
I tried using a loop for this but can't make it work.
My Code:
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select("//head")
    items = []
    item = item()
    for i in range(0,5):
        item ["productname"] = titles.select("//td[@class='product_name'][i]/strong").extract()
        item ["imgurl"] = titles.select("//td[@align='center'][i]/a/img/@src").extract()
        items.append(item)
    return(items)
# .extract() turns the selected nodes into plain strings
names = hxs.xpath('//td[@class="product_name"]/strong/text()').extract()
imageurls = hxs.xpath('//tr/td[@align="center"]/a/img/@src').extract()
# pair the i-th name with the i-th image URL
for name, url in zip(names, imageurls):
    item["productname"] = name
    item["imgurl"] = url
    yield item
This is the simplest way of doing it, since the names and image URLs are extracted in the same order and therefore correspond to each other.
You don't need to select the elements one by one (by changing the i index in a loop as you did). The path expression below:
//td[@class='product_name']/strong/a/@name
already returns a node-set containing two items. You just have to loop over the elements that were returned to extract each attribute string.
As for the second expression:
//td[@align='center']/a/img/@src
there is only one match and you could extract the string directly.
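To end up with one product per CSV row instead of every name in a single cell, each name/URL pair has to become its own record. The pairing step can be sketched without Scrapy using the standard csv module. The lists below are sample values copied from the HTML in the question and stand in for the extracted results; the function name write_products is hypothetical.

```python
import csv
import io


def write_products(names, imageurls, out):
    """Write one CSV row per (name, image URL) pair."""
    writer = csv.writer(out)
    writer.writerow(["productname", "imgurl"])
    # zip pairs the i-th name with the i-th URL, relying on both lists
    # being in document order.
    for name, url in zip(names, imageurls):
        writer.writerow([name, url])


names = ["VS-20B", "VS-20C", "VS-23B"]
imageurls = [
    "images/products/vs-20b.jpg",
    "images/products/vs-20c.jpg",
    "images/products/vs-23b.jpg",
]
buffer = io.StringIO()
write_products(names, imageurls, buffer)
print(buffer.getvalue())
```

Note that zip stops at the shorter list, and if one product had no image the remaining pairs would fall out of step. Selecting name and image per product box would be more robust in that case.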

how to get graphviz records to have cells that line up

I'm using a record node in graphviz to make a simple table, but it looks wrong:
digraph g {
node [shape = record,height=.08];
node1[label = "{DBAT|{ 0|1|2|3|4|5|6|7}|{8|9|10|11|12|13|14|15}|...|{248|249|250|251|252|253|254|255}}"];
}
Is there any way to get the subrecords to line up?
HTML-formatted nodes will probably make this easier. See http://www.graphviz.org/doc/info/shapes.html#html for details. Tables are supported.
In your case, this is impossible to do with record labels alone; you should try the HTML-like syntax.
https://graphviz.org/doc/info/shapes.html#html
digraph g {
node2[shape="none" label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="6" COLUMNS="*">
<TR><TD COLSPAN="8">DBAT</TD></TR>
<TR>
<TD>0</TD>
<TD>1</TD>
<TD>2</TD>
<TD>3</TD>
<TD>4</TD>
<TD>5</TD>
<TD>6</TD>
<TD>7</TD>
</TR>
<TR>
<TD>8</TD>
<TD>9</TD>
<TD>10</TD>
<TD>11</TD>
<TD>12</TD>
<TD>13</TD>
<TD>14</TD>
<TD>15</TD>
</TR>
<TR><TD COLSPAN="8">...</TD></TR>
<TR>
<TD>248</TD>
<TD>249</TD>
<TD>250</TD>
<TD>251</TD>
<TD>252</TD>
<TD>253</TD>
<TD>254</TD>
<TD>255</TD>
</TR>
</TABLE>>];
}
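Typing out every cell by hand gets tedious for larger tables, so the label can also be generated. A minimal Python sketch follows, assuming fixed-width rows and using None to mark the "..." row; grid_table is a made-up helper name.

```python
def grid_table(title, rows, columns=8):
    """HTML-like label: a title row spanning all columns, then data rows.

    rows: list of lists of cell values, or None for an elision row ('...').
    """
    parts = [
        '<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0" CELLPADDING="6">',
        f'<TR><TD COLSPAN="{columns}">{title}</TD></TR>',
    ]
    for row in rows:
        if row is None:
            parts.append(f'<TR><TD COLSPAN="{columns}">...</TD></TR>')
        else:
            cells = "".join(f"<TD>{value}</TD>" for value in row)
            parts.append(f"<TR>{cells}</TR>")
    parts.append("</TABLE>")
    return "\n".join(parts)


rows = [list(range(0, 8)), list(range(8, 16)), None, list(range(248, 256))]
label = grid_table("DBAT", rows)
dot = f'digraph g {{\nnode2 [shape="none" label=<\n{label}>];\n}}'
print(dot)
```

Because all cells belong to one table, the columns line up regardless of how wide the individual numbers are.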
