Can I use sightly display context along with i18n label? - internationalization

I am seeing this code in my project:
${'myproj.label' @ i18n, format=[sighltyObj.field1], context='text'}
The intention is to pass a variable into the i18n text and encode the text safely. Is it right to use a display context along with i18n translations? When I tested with field1 = "Hello%20World", the text was not encoded; it was rendered as is.
How can I encode HTML strings while passing arguments as variables into i18n?

HTL will not decode the text returned by format. I think the confusion comes from the documentation which states for the display context text the following:
Use this for simple HTML content - Encodes all HTML
(Source: HTL Specification Section 1.2.1 Display Context)
But this does not mean that this context decodes anything, it encodes HTML tags.
So if sighltyObj.field1 is Hello%20World it will not be rendered as Hello World but as Hello%20World as you already noticed.
The display context text will encode all HTML tags in the given text so that you can't "smuggle" them into a text (see code injection).
So for example:
${'<p>This is my text</p>' @ context='text'}
will create the following HTML:
&lt;p&gt;This is my text&lt;/p&gt;
Note how the p tags were encoded:
<p> became &lt;p&gt; and </p> became &lt;/p&gt;.
The getter for field1 in your sighltyObj will have to do the decoding so that Hello%20World becomes Hello World. There is already an answer on Stack Overflow that shows how to do this: https://stackoverflow.com/a/6138183/190823
String result = java.net.URLDecoder.decode(url, "UTF-8");
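The difference between the two operations can be seen in a small sketch. It is written in Python rather than HTL or Java, purely for illustration, and uses only the standard library: urllib.parse.unquote stands in for the percent-decoding the getter would have to do, and html.escape stands in for roughly what the text display context does to markup. Neither is the actual HTL implementation.

```python
from html import escape
from urllib.parse import unquote

# Percent-decoding: the step the getter for field1 would need to perform.
raw_field = "Hello%20World"
decoded = unquote(raw_field)

# HTML-encoding: roughly what context='text' does to markup.
markup = "<p>This is my text</p>"
encoded = escape(markup)

# HTML-encoding leaves percent-escapes untouched, as seen in the question.
untouched = escape(raw_field)

print(decoded)
print(encoded)
print(untouched)
```

Encoding and decoding are independent steps here: the percent-escape only disappears when it is decoded explicitly.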

How do I do strikethrough (line-through) in asciidoc?

How do I render a strikethrough (or line-through) in an adoc file?
Let's presume I want to write "That technology is -c-r-a-p- not perfect."
That technology is [line-through]#crap# not perfect.
As per the AsciiDoc manual, [line-through] is deprecated. You can test here.
Comment from Dan Allen
It's important to understand that line-through is just a CSS role. Therefore, it needs support from the stylesheet in order to appear as though it is working.
If I run the following through Asciidoctor (or Asciidoctor.js):
[.line-through]#strike#
I get:
<span class="line-through">strike</span>
The default stylesheet has a rule for this:
.line-through{text-decoration:line-through}
You would need to do the same.
It is possible to customize the HTML that is generated using custom templates (Asciidoctor.js supports Jade templates). In that case, you'd override the template for inline_quoted, check for the line-through role and produce either an <s> or, preferably, a <del> instead of the span.
If you're only targeting the HTML backend, you can insert HTML code verbatim via a passthrough context. This can be done inline by wrapping the parts in +++:
That technology is +++<del>+++crap+++</del>+++ not perfect.
This won't help you for PDF, DocBook XML, or other output formats, though.
If the output is intended for HTML you can pass HTML.
The <s> HTML element renders text with a strikethrough, or a line
through it. Use the element to represent things that are no longer
relevant or no longer accurate.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/s
To render as:
Example text.
use:
1. Pass inline:
Example +++<s>text</s>+++.
2. Pass-through macro:
Example pass:[<s>text</s>].
3. Pass block:
++++
Example <s>text</s>.
++++

Cast a Nokogiri::XML::Document to a Nokogiri::HTML::Document

I want to transform an XML document to HTML using XSL, tinker with it a little, then render it out. This is essentially what I'm doing:
source = Nokogiri::XML(File.read 'source.xml')
xsl = Nokogiri::XSLT(File.read 'transform.xsl')
transformed = xsl.transform(source)
html = Nokogiri::HTML(transformed.to_html)
html.title = 'Something computed'
Stylesheet::transform always returns XML::Document, but I need a HTML::Document instance to use methods like title=.
The code above works, but exporting and re-parsing as HTML is just awful. Since the target is a subclass of the source, there must be a more effective way to perform the conversion.
How can I clean up this mess?
As a side question, Nokogiri has generally underwhelmed me with its handling of doctypes, unawareness of <meta charset= etc... does anyone know of a less auto-magic library with similar capabilities?
Many thanks ;)
HTML::Document extends XML::Document, but the individual nodes in a HTML document are just plain XML::Nodes, i.e. there aren’t any HTML::Nodes. This suggests a way of converting an XML document to HTML by creating a new empty HTML::Document and setting its root to that of the XML document:
html = Nokogiri::HTML::Document.new
html.root = transformed.root
The new document has the HTML methods like title= and meta_encoding= available, and when serialized it produces an HTML document rather than XML: it adds an HTML doctype, correctly uses empty tags like <br>, displays minimized attributes where appropriate (e.g. <input type="checkbox" selected>) and doesn’t escape things like > in <script> blocks.

Passing JSON as HTML element text

Would there be bad consequences from transporting JSON in HTML like this:
<div id="json" style="display: none;">{"foo": "bar"}</div>
assuming HTML chars such as < are escaped as &lt; in the element text?
The JSON could be strictly parsed:
var blah = $.parseJSON($('#json').html())
in a try/catch statement, for example. The rationale is to enable passing of JSON in Ajax'd HTML responses, when script tags are being stripped and not executed. An example would be Ajax requests made using the jQuery .load() special selector syntax:
$('#here').load('some.html #fragment')
...which ditches all script tags and thus prevents the use of:
<script>var blah = {"foo":"bar"}</script>
I've seen JSON being passed around in HTML attributes, and I'd guess this is equivalent - w.r.t. weirdness, security, etc - but is far less readable due to all the additional quote-escaping.
The natural way of passing JS data in HTML is through JavaScript code (if it is part of the actual JavaScript code, as in the case of initial values/configuration) or through HTML5 data- attributes (whenever JS code is not necessary; always when the data needs to be attached to DOM elements).
In your example this would be probably the best:
<div id="json" style="display: none;"
data-something="{&quot;foo&quot;:&quot;bar&quot;}">
</div>
but reorganize your data to actually follow HTML structure:
<div class="profile-container"
data-profile="{&quot;name&quot;:&quot;John Doe&quot;,&quot;id&quot;:123}">
... profile 123 ...
</div>
<div class="profile-container"
data-profile="{&quot;name&quot;:&quot;Jane Doe&quot;,&quot;id&quot;:321}">
... profile 321 ...
</div>
(quoting should be done server-side, e.g. using PHP's htmlspecialchars(...), or Python's html.escape(..., quote=True), which replaces the cgi.escape(..., True) that was removed in Python 3.8).
And then you can obtain the data in one of multiple ways, eg. using jQuery's .data() method.
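As a minimal sketch of the server-side quoting step, assuming a Python backend: the helper name data_attribute is hypothetical, and only json.dumps and html.escape from the standard library are used.

```python
import json
from html import escape


def data_attribute(name, value):
    """Return a data-* attribute holding `value` as HTML-escaped JSON.

    `data_attribute` is a hypothetical helper for illustration only.
    """
    # Compact separators keep the attribute short; they are optional.
    payload = json.dumps(value, separators=(",", ":"))
    # quote=True also escapes double quotes, which is what makes the
    # JSON safe inside a double-quoted attribute value.
    return 'data-{}="{}"'.format(name, escape(payload, quote=True))


profile = {"name": "John Doe", "id": 123}
attr = data_attribute("profile", profile)
print('<div class="profile-container" {}>'.format(attr))
```

The browser undoes the entity escaping when it parses the attribute, so the client reads back plain JSON.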
EDIT:
Yes, your approach of embedding JSON as the content of HTML tags and hiding it with CSS has gotchas. As I said, if you want to pass data in HTML, the only "best practice" way is to attach it to one of the HTML elements (you are kind of doing that anyway, but you use CSS to hide it, while you could use existing solutions for passing JSON/data that do not depend on styles a client could override). Proof of one of the disadvantages is here: http://jsfiddle.net/NY7Bs/ (the data is passed both ways, but one simple external style overrides your inline styles and shows the content, not to mention the effect on the semantics of your document).
Why not simply use the .ajax() function? Then you would get only the string with the JSON, and you could parse it as you suggested.

Convert HTML to plain text and maintain structure/formatting, with ruby

I'd like to convert html to plain text. I don't want to just strip the tags though, I'd like to intelligently retain as much formatting as possible. Inserting line breaks for <br> tags, detecting paragraphs and formatting them as such, etc.
The input is pretty simple, usually well-formatted html (not entire documents, just a bunch of content, usually with no anchors or images).
I could put together a couple regexs that get me 80% there but figured there might be some existing solutions with more intelligence.
First, don't try to use regex for this. The odds are really good you'll come up with a fragile/brittle solution that will break with changes in the HTML or will be very hard to manage and maintain.
You can get part of the way there very quickly using Nokogiri to parse the HTML and extract the text:
require 'nokogiri'
html = '
<html>
<body>
<p>This is
some text.</p>
<p>This is some more text.</p>
<pre>
This is
preformatted
text.
</pre>
</body>
</html>
'
doc = Nokogiri::HTML(html)
puts doc.text
>> This is
>> some text.
>> This is some more text.
>>
>> This is
>> preformatted
>> text.
The reason this works is Nokogiri is returning the text nodes, which are basically the whitespace surrounding the tags, along with the text contained in the tags. If you do a pre-flight cleanup of the HTML using tidy you can sometimes get a lot nicer output.
The problem comes when you compare the output of a parser, or any other means of looking at the HTML, with what a browser displays. The browser is concerned with presenting the HTML in as pleasing a way as possible, ignoring the fact that the HTML can be horribly malformed and broken. The parser is not designed to do that.
You can massage the HTML before extracting the content to remove extraneous line-breaks, like "\n", and "\r" followed by replacing <br> tags with line-breaks. There are many questions here on SO explaining how to replace tags with something else. I think the Nokogiri site also has that as one of the tutorials.
If you really want to do it right, you'll need to figure out what you want to do for <li> tags inside <ul> and <ol> tags, along with tables.
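The tag-by-tag approach described above can be sketched as follows. This is an illustration in Python using the standard library's html.parser, not Nokogiri; the class name TextExtractor, the choice of block-level tags, and the "* " bullet prefix are all assumptions, and <pre> blocks and tables are deliberately not handled.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text, turning a few structural tags into line breaks."""

    # Assumed set of tags that should end with a blank line.
    BLOCK_TAGS = {"p", "div", "ul", "ol"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")
        elif tag == "li":
            # Each list item starts on its own line with a bullet.
            self.parts.append("\n* ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Collapse runs of whitespace the way a browser would.
        self.parts.append(re.sub(r"\s+", " ", data))

    def text(self):
        joined = "".join(self.parts)
        joined = re.sub(r" *\n *", "\n", joined)    # trim spaces around breaks
        joined = re.sub(r"\n{3,}", "\n\n", joined)  # at most one blank line
        return joined.strip()


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = ("<p>This is<br>some text.</p>\n<p>More text.</p>"
          "<ul><li>one</li><li>two</li></ul>")
print(html_to_text(sample))
```

Extending it to ordered lists or tables means deciding on a text convention for each, which is the part no library can decide for you.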
An alternate attack method would be to capture the output of one of the text browsers like lynx. Several years ago I needed to do text processing for keywords on websites that didn't use Meta-Keyword tags, and found one of the text-browsers that let me grab the rendered output that way. I don't have the source available so I can't check to see which one it was.

How do I add an image to an item in RSS 2.0?

Is there a way to send only an Image with a link and some alt text for each item in an RSS feed?
I looked at the enclosure tag but this is only for videos and music.
The enclosure element can be used to transmit pictures. The RSS 2.0 spec is quite clear about that, saying that the type is a MIME type. It does not say it is restricted to audio or video.
Here's an example: a set of photo feeds from Agence France Presse
One solution is to use CDATA in the description:
<![CDATA[
Image inside RSS
<img src="http://example.com/img/smiley.gif" alt="Smiley face">
]]>
Note that you may have a problem if the site hosting the image prevents hotlinking.
This is possible in RSS 2.0, see
http://cyber.law.harvard.edu/rss/rss.html#ltenclosuregtSubelementOfLtitemgt
So you have to use the enclosure tag to add media.
You should use the enclosure tag within item to include the image. You can use it for images by setting the correct MIME type (for example: image/jpeg) and including the image size in bytes as the "length" attribute. The length attribute doesn't need to be completely accurate, but it's required for the RSS to be considered valid.
Here's a helpful article that discusses this and other options.
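As a minimal sketch of such an item, assuming the feed is generated with Python's xml.etree.ElementTree: the function name image_item, the example.com URLs and the byte count are placeholders, not values from any real feed.

```python
import xml.etree.ElementTree as ET


def image_item(title, link, image_url, image_bytes, mime_type="image/jpeg"):
    """Build an RSS 2.0 <item> whose enclosure points at an image.

    `image_item` is a hypothetical helper for illustration only.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # url, length and type are the three attributes of <enclosure>.
    ET.SubElement(item, "enclosure", {
        "url": image_url,
        "length": str(image_bytes),
        "type": mime_type,
    })
    return item


item = image_item(
    "Smiley face",
    "http://example.com/posts/1",       # placeholder link
    "http://example.com/img/smiley.jpg",  # placeholder image URL
    12345,                                # placeholder size in bytes
)
print(ET.tostring(item, encoding="unicode"))
```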
To work with the Mailchimp RSS to email feature, they expect the image to be specified in a <media:content> element inside <item>. This is their source for the feed item's image macro in their templates.
Thus, you need to add to the declarations
xmlns:media="http://search.yahoo.com/mrss/"
Then inside the <item> element add
<media:content medium="image" url="http://whatever/foo.jpg" width="300" height="201" />
Without the extra declaration, the feed is invalid since media:content is not a known element.
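A short sketch of how the namespace declaration and the element fit together, assuming the feed is built with Python's xml.etree.ElementTree (the item title is a placeholder; the URL and dimensions are the ones from the example above):

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"
# Ask ElementTree to serialize this namespace with the "media" prefix.
ET.register_namespace("media", MEDIA_NS)

rss = ET.Element("rss", {"version": "2.0"})
channel = ET.SubElement(rss, "channel")
item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "Foo"  # placeholder title

# Namespaced tags are written as {namespace-uri}localname.
ET.SubElement(item, "{%s}content" % MEDIA_NS, {
    "medium": "image",
    "url": "http://whatever/foo.jpg",
    "width": "300",
    "height": "201",
})

feed_xml = ET.tostring(rss, encoding="unicode")
print(feed_xml)
```

ElementTree emits the xmlns:media declaration on the root element by itself once a namespaced element is present, so the declaration and the element cannot get out of sync.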
Inside the item tag:
<image:image xmlns:image="http://web.resource.org/rss/1.0/modules/image/">
http://domain.com/image.jpg
</image:image>
Inside the description tag:
<![CDATA[
Some Text..
<br/><img src='http://domain.com/image.jpg' ><br/>
More Text
]]>
Regarding the <p> tag issue, you need to encode HTML within the XML.
Your code would look something like this:
<description>&lt;p&gt; Text in the tag &lt;/p&gt;</description>
Since you are using PHP, you can use htmlentities() to encode the HTML tags. They look horrible in the XML, but RSS readers know what to do with them.
http://php.net/manual/en/function.htmlentities.php
