How can I handle an ampersand ("&") character in a Telerik HTML textbox?
While rendering, it's giving me an error. Also, does anybody know about any other character that may cause errors in an HTML textbox?
Ampersand is a special character in HTML that specifies the start of an escape sequence (so you can do something like © to get a copyright symbol, etc.). If you want to display an ampersand you have to escape it. So if you replace all ampersands with &, that should take care of the error.
However, if there were ampersands in your input that were already escaped - like maybe your data had © - you wouldn't want to escape that ampersand. But if your data won't have any of these ampersands, a simple replace should be fine.
You also need to replace greater than and less than symbols (> and <) with > and < respectively.
Telerik talks about these limitations/issues on this page http://www.telerik.com/help/reporting/report-items-html-text-box.html
Also according to the HTML specification (and the general XML
specification as well) the "&", "<" and ">" characters are considered
special (markup delimiters), so they need to be encoded in order to be
treated as regular text. For example the "&" character can be escaped
with the "&" entity. More information on the subject you can find in
this w3.org article.
Related
I am trying to remove white-space from japanese word.
input "かいしゃ(会社)"
output "かいしゃ(会社)"
The space here is consumed by the parentheses. They are not your regular ASCII parentheses, they are of the "full width" flavor.
If you want to replace them with ASCII parentheses, you can do it like this:
compact_input = input.gsub("\uFF08", '(') # and a similar step for the closing parenthesis
Although this might make your string look weird in japanese (I don't know the language well enough, so can't say)
I'm trying to build a DSL which will contain a number of XPaths as parameters. I'm new to XPath, and I need a character which is never used in the XPath syntax so I can delimit n number of XPaths on a single line of a script. My question: what characters are NOT part of the XPath syntax?
The null character.
Seriously. Because an XPath is supposed to support any XML document, it must be capable of matching text nodes that contain any allowed Unicode character. However, XML disallows one character: the null character.
Ok, that is not entirely true, but it is simplest. As in XML 1.1, control characters were supported, except Unicode Null. However, as per the XML 1.0 production of Char, there are a few other characters you can choose from: surrogate pairs (as characters, not as correctly encoded octets representing a non-BMP character), and anything before 0x20, except linefeed, carriage return and tab.
Another good guess is any Private Use character, as it is unlikely it is used by your input documents, however, this is not guaranteed, and you asked for "never".
I'm trying to build a DSL which will contain a number of XPaths as parameters.
Well, many people use XML for DSLs, and this is how you would do it in XML:
<paths>
<path>/a/b/c/d</path>
<path>/w/x/y/z</path>
</path>
So how do we reconcile this with the fact that "<" can appear in an XPath expression? Answer: if it does appear, we escape it:
<paths>
<path>/a/b/c/d[e < 3]</path>
<path>/w/x/y/z[v < 2]</path>
</path>
So: don't try to find a character that can't appear in an XPath expression. Use a character that can appear, and escape it if it does.
I would like to use the back-tick in regular text (not in a code snippet) in TW5. Is this possible?
It is also possible to use the hex code (`) or the HTML code (`) for back-tick
I wanted to display a back-tick in a code block, so used:
<code>`</code>
Using the HTML code looks like:
<code>`</code>
This has the advantage that you're not disabling any parsing rules.
In TiddlyWiki5 you can disable certain parsing rules using the \rules pragma
A pragma is a special component of WikiText that provides control over the way the remaining text is parsed.
http://tiddlywiki.com/#Pragma
So if you add
\rules except codeinline
at the very(!) beginning of your tiddler text, any following backtick symbol in the text is not interpreted as special character.
This comes however at the cost that you cannot use this symbol as wikitext-directive anymore to achieve inline-code for programming snippets. Instead you would need to add the html code tag manually.
If I were to take the error logs of a file, and put them on a properly formatted HTML page, the page might not pass validation by W3C's validation tool because the error text contains markup that the validation tool gets confused with. How can I properly mark such text such that the validation tool realizes that it's reading what should be interpreted as a normal string and not markup?
You need to convert the following characters into their HTML entities (stolen from the PHP manual on htmlspecialchars()):
'&' (ampersand) becomes '&'
'"' (double quote) becomes '"'
''' (single quote) becomes '''
'<' (less than) becomes '<'
'>' (greater than) becomes '>'
If it really is a normal string, then you should represent the characters that have special meaning in HTML with their respective entities. < to < etc.
This will, of course, cause browsers to render those characters as text and not tags.
If you want browsers to render them as tags, then they aren't a normal string and they are a validity error.
In my Ruby app, I've used the following method and regular expression to remove all HTML tags from a string:
str.gsub(/<\/?[^>]*>/,"")
This regular expression did just about all I was expecting it to, except it caused all quotation marks to be transformed into “
and all single quotes to be changed to ”
.
What's the obvious thing I'm missing to convert the messy codes back into their proper characters?
Edit: The problem occurs with or without the Regular Expression, so it's clear my problem has nothing to do with it. My question now is how to deal with this formatting error and correct it. Thanks!
Use CGI::unescapeHTML after you perform your regular expression substitution:
CGI::unescapeHTML(str.gsub(/<\/?[^>]*>/,""))
See http://www.ruby-doc.org/core/classes/CGI.html#M000547
In the above code snippet, gsub removes all HTML tags. Then, unescapeHTML() reverts all HTML entities (such as <, “) to their actual characters (<, quotes, etc.)
With respect to another post on this page, note that you will never ever be passed HTML such as
<tag attribute="<value>">2 + 3 < 6</tag>
(which is invalid HTML); what you may receive is, instead:
<tag attribute="<value>">2 + 3 < 6</tag>
The call to gsub will transform the above to:
2 + 3 < 6
And unescapeHTML will finish the job:
2 + 3 < 6
You're going to run into more trouble when you see something like:
<doohickey name="<foobar>">
You'll want to apply something like:
gsub(/<[^<>]*>/, "")
...for as long as the pattern matches.
This regular expression did just about
all I was expecting it to, except it
caused all quotation marks to be
transformed into “ and all
single quotes to be changed to ”
.
This doesn't sound as if the RegExp would be doing this. Are you sure it's different before?
See this question here for information about the problem, it has got an excellent answer:
Get non UTF-8 form fields as UTF-8 in php.
I've run into a similar problem with character changes, this happened when my code ran through another module that enforced UTF-8 encoding and then when it came back, I had a different file (slurped array of lines) on my hands.
You could use a multi-pass system to get the results you are looking for.
After running your regular expression, run an expression to convert &8220; to quotes and another to convert &8221; to single quotes.