How to Get Nokogiri to Show Node and not just HTML - ruby

Right now, when I parse some HTML (the front page of Hacker News, for example), it works fine. If I call class on something like doc = Nokogiri::HTML(open('news.ycombinator.com')), I get back Nokogiri::HTML::Document < Nokogiri::XML::Document
The issue is that, in the terminal, I am seeing the HTML and not the actual Nokogiri element. I want to see the element because it shows me valuable information, such as its children or an array of links.
I get the HTML with the Watir gem, using the following method:
[1] pry(main)> browser = Watir::Browser.new(:firefox)
#<Watir::Browser:0x2c5654b29ef00c22 url="about:blank" title="">
[2] pry(main)> browser.goto('news.ycombinator.com')
"http://news.ycombinator.com"
[3] pry(main)> browser.html
Where browser.html is an instance variable (I think?) containing un-parsed HTML.
Here is what I get back right now if I call doc = Nokogiri::HTML.parse(browser.html)
And here is what I would like to get back:
Where am I going wrong?
adding raw code as requested:
Nokogiri::HTML::Document < Nokogiri::XML::Document
[31] pry(main)> doc = Nokogiri::HTML.parse(browser.html)
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html op="news">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="referrer" content="origin">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" type="text/css" href="news.css?stXbi7LCyutClfTUMe1b">
<link rel="shortcut icon" href="favicon.ico">
<link rel="alternate" type="application/rss+xml" title="RSS" href="rss">
<title>Hacker News</title>
</head>
<body>
<center><table id="hnmain" width="85%" cellspacing="0" cellpadding="0" border="0" bgcolor="#f6f6ef">
<tbody>
<tr><td bgcolor="#ff6600"><table style="padding:2px" width="100%" cellspacing="0" cellpadding="0" border="0"><tbody><tr>
<td style="width:18px;padding-right:4px"><img src="y18.gif" style="border:1px white solid;" width="18" height="18"></td>
<td style="line-height:12pt; height:10px;"><span class="pagetop"><b class="hnname">Hacker News</b>
new | past | comments | ask | show | jobs | submit </span></td>
<td style="text-align:right;padding-right:4px;"><span class="pagetop">
login
</span></td>
</tr></tbody></table></td></tr>
<tr id="pagespace" title="" style="height:10px"></tr>
<tr><td>
<table class="itemlist" cellspacing="0" cellpadding="0" border="0">
<tbody>
<tr class="athing" id="19388248">
<td class="title" valign="top" align="right"><span class="rank">1.</span></td> <td class="votelinks" valign="top"><center><a id="up_19388248" href="vote?id=19388248&how=up&goto=news"><div class="votearrow" title="upvote"></div></a></center></td>
<td class="title">
Getting Too Absorbed in Your Side Projects<span class="sitebit comhead"> (<span class="sitestr">bennettnotes.com</span>)</span>
</td>
</tr>
<tr>
<td colspan="2"></td>
<td class="subtext">
<span class="score" id="score_19388248">42 points</span> by _davebennett <span class="age">1 hour ago</span> <span id="unv_19388248"></span> | hide | 27 comments </td>
</tr>
<tr class="spacer" style="height:5px"></tr>
<tr class="athing" id="19384878">
<td class="title" valign="top" align="right"><span class="rank">2.</span></td> <td class="votelinks" valign="top"><center><a id="up_19384878" href="vote?id=19384878&how=up&goto=news"><div class="votearrow" title="upvote"></div></a></center></td>
<td class="title">
Facebook’s Data Deals Are Under Criminal Investigation<span class="sitebit comhead"> (<span class="sitestr">nytimes.com</span>)</span>
</td>
</tr>
<tr>
<td colspan="2"></td>
<td class="subtext">
<span class="score" id="score_19384878">661 points</span> by tysone <span class="age">13 hours ago</span> <span id="unv_19384878"></span> | hide | 156 comments </td>
</tr>
<tr class="spacer" style="height:5px"></tr>
<tr class="athing" id="19388091">
<td class="title" valign="top" align="right"><span class="rank">3.</span></td> <td class="votelinks" valign="top"><center><a id="up_19388091" href="vote?id=19388091&how=up&goto=news"><div class="votearrow" title="upvote"></div></a></center></td>
<td class="title">
Krita 4.2.0: First painting application with HDR support on Windows<span class="sitebit comhead"> (<span class="sitestr">krita.org</span>)</span>
</td>
...

It sounds like you want:
doc = Nokogiri::HTML browser.html

Related

How to use REST API PdfReactor / MS Flow Post Request

I would like to convert a simple HTML file to PDF using PDFreactor and MS Flow.
I set up PDFreactor running in a Docker container.
Can somebody help me get the HTTP POST request right so that PDFreactor converts the file to PDF?
PdfReactor Documentation
The payload for most of POST methods of the PDFreactor Web Service must be in XML, JSON or ZIP format (see also https://www.pdfreactor.com/product/doc_html/index.html#payload).
So you should set the "body" of your request to JSON like the following:
{
"document": "Your File Content"
}
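As a rough illustration of the same request outside MS Flow, here is a minimal Ruby sketch that builds the POST with the HTML placed in the "document" property. The endpoint URL, port, and the helper names build_convert_request and convert are assumptions for illustration only; take the real path from the PDFreactor documentation linked above. Generating the body with a JSON library, rather than pasting the HTML between quotes by hand, keeps the quotes and line breaks inside the HTML properly escaped.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical endpoint: adjust host, port, and path to your own PDFreactor container.
PDFREACTOR_URL = 'http://localhost:9423/service/rest/convert.pdf'

# Build (but do not send) a POST request whose body is a JSON object
# carrying the HTML source in the "document" property.
def build_convert_request(html, url = PDFREACTOR_URL)
  request = Net::HTTP::Post.new(URI.parse(url))
  request['Content-Type'] = 'application/json'
  # JSON.generate escapes quotes and newlines that occur inside the HTML.
  request.body = JSON.generate('document' => html)
  request
end

# Send the request and return the raw response body.
# How that body has to be decoded depends on the endpoint you call.
def convert(html, url = PDFREACTOR_URL)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(build_convert_request(html, url))
  end
  response.body
end
```

In MS Flow the equivalent is to set the Content-Type header to application/json and to make sure the HTML variable is inserted as an escaped JSON string.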
NEW Flow HTTP Request Picture
With this request I was able to pass the payload through.
The content of the "html" variable looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Inspectionlist 0583 / 16.05.2020</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css">
<!--
.Stil1 {
font-size: 36px;
font-weight: bold;
}
.Stil11 {font-size: 44px}
.Stil12 {font-size: 36px}
-->
</style>
</head>
<body>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="39%"><div align="center" class="Stil1">
<p><span class="Stil11">I n s p e c t i o n s </span> <br>
16.05.2020<br>
</p>
</div></td>
<td width="32%"><div align="center"><span class="Stil11"><span class="Stil12">Ship-No.:</span> <span class="Stil1">0583</span></span> <br>
</div></td>
</tr>
</table>
<table width="100%" border="1" align="center" cellpadding="5" cellspacing="0" bordercolor="#999999">
<tr bgcolor="#CCCCCC">
<th width="20" scope="col"><strong>No.</strong></th>
<th width="70" scope="col"><strong>Start of insp. </strong></th>
<th width="70" scope="col"><strong>End of insp. </strong></th>
<th width="30" scope="col"><strong>Class.</strong></th>
<th width="30" scope="col"><strong>Yard</strong></th>
<th width="30" scope="col"><strong>Owner</strong></th>
<th width="158" scope="col"><strong>Responsible</strong></th>
<th width="30" scope="col"><div align="center"><strong>BGN</strong></div></th>
<th width="481" scope="col"><strong>Description</strong></th>
<th width="130" scope="col"><strong>Pre-Inspection Yard </strong><strong>Contractor</strong></th>
</tr>
<tr>
<td width="20"><div align="center"><span class="Stil8"></span></div></td>
<td width="70"><div align="center"><span class="Stil8"><B><FONT SIZE="6">No Inspections!<B></FONT> </span></div></td>
<td width="70"><div align="center"><span class="Stil8"></span></div></td>
<td width="30"><div align="center"><span class="Stil8"></span></div></td>
<td width="30"><div align="center"><span class="Stil8"></span></div></td>
<td width="30"><div align="center"><span class="Stil8"></span></div></td>
<td><div align="center"><span class="Stil8"></span></div></td>
<td width="30"><div align="center"><span class="Stil8"></span></div></td>
<td><p align="left" class="Stil3 Stil6"><strong>No.</strong> <strong>Location:</strong> <br><br>
<br></p>
</td>
<td><div align="center"><span class="Stil8"><br>
</span></div> <div align="center"></div></td>
</tr>
</table>
<center> printed on: 15.05.2020 - 19:15 </center>
</body>
</html>
After the conversion, the PDF cannot be opened by readers such as Acrobat. Does anyone know what I'm missing?

Attribute th:each is not allowed here(Error in Thymeleaf template)

I have created an application that stores information about dogs in a database. When I run the project, the tables are created, but the dog information is not updated.
There is an error when running the HTML file below.
The following code is not working:
<html lang="en">
<head>
<!-- META SECTION -->
<title>Dog Rescue</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<!-- END META SECTION -->
<!-- BEGIN STYLE -->
<style>
table, th, td {
border: 1px solid black;
padding: 1px;
}
</style>
<!-- END STYLE -->
</head>
<body>
<h2>Current Dogs In Rescue</h2>
<table>
<thead>
<tr>
<th>ID</th>
<th>Name</th>
<th>Rescue Date</th>
<th>Vaccinated</th>
</tr>
</thead>
<tbody>
<tr th:each="dogs : ${dogs}">
<td th:text="${dogs.id}">Text ...</td>
<td th:text="${dogs.name}">Text ...</td>
<td th:text="${dogs.rescued}">Text ...</td>
<td th:text="${dogs.vaccinated}">Text...</td>
</tr>
</tbody>
</table>
</div>
<h2>Add A Dog</h2>
<form action="#" th:action="#{/}" method="post">
<label>Name<input type="text" name="name" id="name"></input></label>
<label>Vaccinated<input type="text" name="vaccinated" id="vaccinated"></input></label>
<label>Rescued<input type="text" name="rescued" id="rescued"></input></label>
<input type="submit" value="Submit"></input>
</form>
</body>
</html>
The HTML file is not fetching the information.
Kindly help me.
The whole project is available at
https://github.com/arulsuju/DogRescue.git
You are using the same name for the iteration variable as for the list variable (dogs).
Consider using a different name for the iteration variable, such as dog, so the code becomes:
<tr th:each="dog : ${dogs}">
<td th:text="${dog.id}">Text ...</td>
<td th:text="${dog.name}">Text ...</td>
<td th:text="${dog.rescued}">Text ...</td>
<td th:text="${dog.vaccinated}">Text...</td>
</tr>

MSO Conditionals - !mso failing?

I'm working with some email template design for my company and leveraging off of the Zurb Foundation for Email framework (http://foundation.zurb.com/emails). So far, I've been impressed with it.
The issue that I am having is with a column background that will have different text in it depending on the recipient (dynamic). The background is basically a rounded "button" shape with a transparent "arrow" on the right-hand side. Long story short, I was able to design this so it looked "good" in modern email clients using some tables with some basic CSS.
The problem was that my CSS uses "border-radius" and Outlook doesn't support that. I found a workaround: I "simplified" the design specifically for Outlook and use the MSO conditional to show this simplified design when appropriate. The issue is that it ALWAYS seems to fire, no matter what email client I am using (iPhone, Gmail, etc.). I think something has to be wrong with the way I am setting up the conditional.
<table class="row center">
<tr>
<td class="wrapper last panel">
<!--[if mso]>
<table class="twelve columns">
<tr>
<td class="one sub-columns">
Gift Code:
</td>
<td class="eleven sub-columns">
<v:roundrect xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="urn:schemas-microsoft-com:office:word" style="height:40px; v-text-anchor:middle; width:500px;" arcsize="20%" stroke="f" fillcolor="#faa21a">
<w:anchorlock/>
<center style="color:#ffffff;font-family:sans-serif;font-size:16px;font-weight:bold;">
ZZ1234567890ABCD
</center>
</v:roundrect>
</td>
<td class="expander"></td>
</tr>
</table>
<![endif]-->
<!--[if !mso]>
<!-- -->
<table class="twelve columns" style="mso-hide:all;">
<tr>
<td class="one sub-columns">
Gift Code:
</td>
<td class="nine sub-columns promoCalloutInner alOrangeBg" style="mso-hide:all;">
ZZ1234567890ABCD
</td>
<td class="four sub-columns alOrangeBg promoCalloutInnerEnd" style="mso-hide:all;">
<img src="http://mcbain.gamelogic.com/~rdesroches/ALCEmailTemplates/images/transparentArrow.png" />
</td>
<td class="expander"></td>
</tr>
</table>
<!-- <![endif]-->
</td>
</tr>
</table>
I am using the Zurb Inliner tool (http://foundation.zurb.com/emails/inliner.html) to inline all the styles from my CSS.
Any ideas?
It looks like your non-Outlook conditional content (<!--[if !mso]>) isn't closing correctly.
Try this and let me know how you get on
<table class="row center">
<tr>
<td class="wrapper last panel">
<!--[if mso]>
<table class="twelve columns">
<tr>
<td class="one sub-columns">
Gift Code:
</td>
<td class="eleven sub-columns">
<v:roundrect xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w="urn:schemas-microsoft-com:office:word" style="height:40px; v-text-anchor:middle; width:500px;" arcsize="20%" stroke="f" fillcolor="#faa21a">
<w:anchorlock/>
<center style="color:#ffffff;font-family:sans-serif;font-size:16px;font-weight:bold;">
ZZ1234567890ABCD
</center>
</v:roundrect>
</td>
<td class="expander"></td>
</tr>
</table>
<![endif]-->
<!--[if !mso]><!-->
<table class="twelve columns" style="mso-hide:all;">
<tr>
<td class="one sub-columns">
Gift Code:
</td>
<td class="nine sub-columns promoCalloutInner alOrangeBg" style="mso-hide:all;">
ZZ1234567890ABCD
</td>
<td class="four sub-columns alOrangeBg promoCalloutInnerEnd" style="mso-hide:all;">
<img src="http://mcbain.gamelogic.com/~rdesroches/ALCEmailTemplates/images/transparentArrow.png" />
</td>
<td class="expander"></td>
</tr>
</table>
<!--<![endif]-->
</td>
</tr>
</table>
What I changed:
<!--[if !mso]>
<!-- -->
to
<!--[if !mso]><!-->
and
<!-- <![endif]-->
to
<!--<![endif]-->

Add image at background, outlook problems

I have the problem described in the title. I have an email template and everything is OK, except that the background image doesn't work in Outlook. Unfortunately, most people in my country use it. My code is below:
<!--#subject Email - Header #-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="initial-scale=1.0, width=device-width" />
</head>
<body>
{{var non_inline_styles}}
<!-- Start Image Background -->
<table class="image-background" cellpadding="0" cellspacing="0" border="0" width="100%" bgcolor="22262b" background="http://met.ivycommerce.eu/glamoura_main_image.jpg">
<tr>
<td class="image-background" align="center" style="background: url('http://met.ivycommerce.eu/pattern.png') 0 0 repeat">
<table class="container-table" cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="separator-50" height="50"> </td>
</tr>
<!-- Start Three Column -->
<tr>
<td>
<table class="container-table" cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="center" align="center">
<a href="{{store url=""}}">
<img style="display: block; margin: 0 auto;"
{{if logo_width}}
width="{{var logo_width}}"
{{else}}
width="165"
{{/if}}
{{if logo_height}}
height="{{var logo_height}}"
{{else}}
height=""
{{/if}}
src="{{var logo_url}}"
alt="{{var logo_alt}}"
border="0"/>
</a>
</td>
</tr>
</table>
</td>
</tr>
<!-- Start Three Column -->
<tr>
<td class="separator-30" height="30"> </td>
</tr>
<tr>
<td align="center">
<table align="center" border="0" cellpadding="0" cellspacing="0" width="60">
<tr>
<td class="separator-line" bgcolor="#ffffff" height="1"> </td>
</tr>
</table>
</td>
</tr>
<tr>
<td class="separator-30" height="30"> </td>
</tr>
<tr>
<td class="colored-heading" align="center" height="28">
<div style="line-height: 28px;">Nasza Misja</div>
</td>
</tr>
<tr>
<td class="heading" align="center" height="28">
<div style="line-height: 28px;">To uśmiechnięci Klienci</div>
</td>
</tr>
<tr>
<td class="separator-10" height="10"> </td>
</tr>
<tr>
<td class="sub-heading" align="center" height="24">
<div style="line-height: 24px;">Dziękujemy, że dołączyłeś do wielkiego grona Bionaturalnych.</div>
</td>
</tr>
<tr>
<td class="separator-50" height="50"> </td>
</tr>
<tr>
<td align="center">
<table align="center" border="0" cellpadding="0" cellspacing="0">
<tr>
<td class="button" align="center" valign="middle" height="38" width="140">
Wejdź Do Sklepu
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td class="separator-50" height="50"> </td>
</tr>
</table>
</td>
</tr>
</table>
<!-- End Image Background -->
<!-- Begin wrapper table -->
<table width="100%" cellpadding="0" cellspacing="0" border="0" id="background-table">
<tr>
<td valign="top" class="container-td" align="center">
<table cellpadding="0" cellspacing="0" border="0" align="center" class="container-table">
<tr>
<td valign="top" class="top-content">
<!-- Begin Content -->
I tried some code, but without success :( When the background is OK, the rest is aligned left, but it has to be centered. Can somebody help me?
Outlook generally requires old-school markup and disallows many things in email that are permitted on websites. In practice, styles work best when placed in the header, and background images cannot be used. If you want a background image, say behind a title, you would need to create the title and background as a single image and insert it like any other image. The downside is that Outlook will not download images unless the user chooses to. Since a majority of your users use Outlook, best practice is to code for them first and then test across all other clients. Here is a list of Outlook-specific issues identified by Mailchimp, an email service provider, along with the fixes that are available: http://kb.mailchimp.com/campaigns/previews-and-tests/my-campaign-looks-bad-in-outlook.
In reviewing your code provided, here are some other things that I think you might want to consider:
Use fixed measurements instead of percentages;
Instead of using code for spacing, use a 1px x 1px transparent or background colored image that you can adjust the height and the width of to make the space work the way you want;
Put all styles at the top in css format, for example:
<style type="text/css">
body{
margin:0;
padding:0;
font-family:"Trebuchet MS", arial, sans-serif;
}
</style></head>
I have been programming email newsletters for almost ten years now, and the simplest, most basic code works best when the majority of your users have Outlook. Finally, I recommend that you always run your code through an HTML validator, such as https://validator.w3.org/. Even the smallest error can produce unexpected results in Outlook or other clients.
Zydol,
To make this work, you will need to use VML to create an object and assign the background image to that. You can read more about this technique here: https://www.emailonacid.com/blog/article/email-development/emailology_vector_markup_language_and_backgrounds
I do believe Bulletproof Backgrounds is the answer you're looking for!
https://backgrounds.cm/

Ruby list files in remote http server

I have a files listing page on a remote server, say http://myserver.com/uploads. How can I get the list of files using Ruby, preferably with net-http only?
This is the HTML code of the page:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<!-- saved from url=(0025)http://myserver.com/uploads/ -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Index of /uploads</title>
</head>
<body>
<h1>Index of /uploads</h1>
<table>
<tbody>
<tr>
<th><img src="./Index of uploads_files/blank.gif" alt="[ICO]"></th>
<th>Name</th>
<th>Last modified</th>
<th>Size</th>
<th>Description</th></tr><tr><th colspan="5"><hr></th>
</tr>
<tr>
<td valign="top"><img src="./Index of uploads_files/back.gif" alt="[DIR]"></td>
<td>Parent Directory</td>
<td> </td>
<td align="right"> - </td>
<td> </td>
</tr>
<tr>
<td valign="top"><img src="./Index of uploads_files/compressed.gif" alt="[ ]"></td>
<td>Backup_201305281256.tar.gz</td>
<td align="right">28-May-2013 18:00 </td>
<td align="right"> 13M</td><td> </td>
</tr>
<tr><th colspan="5"><hr></th></tr>
</tbody>
</table>
<address>Apache/2.2.22 (Ubuntu) Server at myserver.com Port 80</address>
</body>
</html>
What you see is an HTML page, generated by the HTTP server, with links to the files.
You'll need to parse this HTML to get the list of files, or use a regex to match the URIs.
Take a look at the URI regex.
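The regex approach can be sketched with net/http and a plain pattern over href attributes. This is only a sketch under assumptions: it presumes the live index page wraps each file name in a link such as <a href="Backup_201305281256.tar.gz"> (the links are not visible in the HTML pasted above), and the helper names file_names and list_remote_files are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Extract file names from an Apache-style index page by scanning href values.
# Assumes every file row contains <a href="NAME">. Sort links ("?C=N;O=D"),
# the parent-directory link, absolute URLs, and subdirectories (trailing "/")
# are filtered out.
def file_names(html)
  html.scan(/<a\s+[^>]*href="([^"]+)"/i)
      .flatten
      .reject do |href|
        href.start_with?('?', '/', '..') ||
          href.end_with?('/') ||
          href.include?('://')
      end
end

# Fetch the listing page and return the file names found in it.
def list_remote_files(url)
  file_names(Net::HTTP.get(URI.parse(url)))
end

# Example call (not run here):
#   puts list_remote_files('http://myserver.com/uploads/')
```

The returned names are the raw href values, so names containing spaces or other special characters will still be percent-encoded. A regex like this is fragile against markup changes; an HTML parser is the more robust choice if an extra dependency is acceptable.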

Resources