Why is there no UTF-8 encoding option in Firefox?
Maybe it is wrong for Firefox to write "Unicode" in the encoding line; should "UTF-8" or "Unicode" be displayed in the encoding line?
What is the reason?
This option is UTF-8, yes. It used to say “Unicode (UTF-8)”, which was clearer.
It seems that when the encoding menu was tidied (bug 805374, I think), the encoding labels were made ‘friendlier’ by replacing the technical encoding name with a more general description, or removing it when it is the only selectable option.
It makes sense that other UTF encodings are not included: as non-ASCII-compatible encodings they can't easily be mistaken for one another and switched between; UTF-8 is the only Unicode-family encoding that fits here. But the result of calling UTF-8 just “Unicode” is unfortunate, in that Microsoft has always (misleadingly) used the term “Unicode” to mean UTF-16LE.
The reasoning (as I understand it) for not labelling it UTF-8 might be that it lets the user choose whichever UTF encoding they need, such as UTF-16 or UTF-8.
Firefox uses Unicode, and to do so it uses charset=utf-8.
You need to understand that Firefox will use the encoding specified in a meta tag if the server does not send an encoding via the HTTP response headers.
It is specified like this:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I am using Spring Boot version 2.0.5 and Liquid template version 0.7.8.
My problem is that when I use German text in the template file and then send mail, a few German characters are converted into ? marks.
So what is the solution for this?
Somewhere along the path from the text file template, through processing, to sending out as an email, the character encoding is being mangled: the German characters, encoded in one scheme, are being incorrectly rendered as the wrong glyph in another scheme in the email.
The first thing to check is the encoding of the template file. Then investigate how the email is being rendered. For example, if it is an HTML email, see if there is a character encoding declared in the header that differs from the file's encoding, e.g.:
<head><meta charset="utf-8" /></head>
If this differs from the encoding of the file, e.g. ISO-8859-1, then the first thing I would try is to resave the template in UTF-8; you should be able to do that within most IDEs or advanced text editors such as Notepad++.
(As the glyphs are question marks it may be that the template is UTF-8 or UTF-16 and the HTML is in a more limited charset.)
If that doesn't work then you may need to look at your code and pay attention to how the raw bytes from the template are converted to Strings. For example:
String template = new String(bytesFromFile);
would use the system default Charset, which might differ from the file's encoding (and the String-literal variant new String(bytes, "UTF-8") also forces you to handle a checked UnsupportedEncodingException). The safe way to convert the bytes to a String is to specify the character set explicitly:
String template = new String(bytesFromFile, StandardCharsets.UTF_8);
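To see where the question marks come from, here is a minimal sketch (the greeting string is just an example): encoding German text to a charset that cannot represent umlauts silently replaces each unmappable character with '?', while an explicit UTF-8 round trip preserves the text.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String greeting = "Grüße"; // German text with characters outside ASCII

        // Encoding to US-ASCII cannot represent 'ü' or 'ß'; Java's default
        // replacement behaviour substitutes '?' for each unmappable character.
        byte[] ascii = greeting.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // Gr??e

        // An explicit UTF-8 round trip preserves the original characters.
        byte[] utf8 = greeting.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8)); // Grüße
    }
}
```

If you see '?' in the mail, some step in the pipeline is encoding to a charset that lacks those characters, which is the behaviour the first println reproduces.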
<?xml version="1.0" encoding="utf-8"?>
As per W3C standards we have to use UTF-8 encoding. Why can we not use UTF-16 or any of the other encoding formats?
What's the difference between UTF-8 encoding and the rest of the encoding formats?
XHTML doesn't require UTF-8 encoding. As explained in this section of the specification, any character encoding can be given -- but the default is UTF-8 or UTF-16.
According to W3Schools, there are many character encodings available to help the browser interpret text, for example:
UTF-8 - Character encoding for Unicode
ISO-8859-1 - Character encoding for the Latin alphabet.
There are several ways to specify which character encoding is used in the document. First, the web server can include the character encoding or "charset" in the Hypertext Transfer Protocol (HTTP) Content-Type header, which would typically look like this:[1]
Content-Type: text/html; charset=ISO-8859-4
This method gives the HTTP server a convenient way to alter a document's encoding according to content negotiation; certain HTTP server software can do it, for example Apache with the module mod_charset_lite.[2]
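As an illustration of why the declared charset matters, the same byte sequence decodes to different text depending on which encoding the receiver assumes. A small sketch in Java:

```java
import java.nio.charset.StandardCharsets;

public class DeclaredCharsetDemo {
    public static void main(String[] args) {
        // The two-byte UTF-8 encoding of 'é' (U+00E9).
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 };

        // Decoded as the sender intended:
        System.out.println(new String(bytes, StandardCharsets.UTF_8));      // é

        // Decoded as ISO-8859-1, as a browser would if the declared
        // charset in the Content-Type header were wrong:
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1)); // Ã©
    }
}
```

This is why the charset the server declares must match the bytes it actually sends.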
We need our web application to handle additional characters, and so we need to move from ISO-8859-1 to UTF-8. So my question is: is UTF-8 backwards compatible with ISO-8859-1?
I have made the following changes and can now handle all characters, but I want to make sure there are no edge cases I'm missing.
Changed Content-Type:
from "text/html; charset=ISO-8859-1"
to "text/html; charset=UTF-8"
Tomcat Connector URIEncoding from ISO-8859-1 to UTF-8
Thanks
is UTF-8 backwards compatible with ISO-8859-1?
Unicode is a superset of the code points contained in ISO-8859-1, so all the "characters" can be represented in UTF-8, but how they map to byte values differs. There is overlap between the encoded values, but it is not 100%.
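A quick way to see that partial overlap (a sketch; the word is just an example): ASCII characters encode to identical bytes in both charsets, but characters above U+007F do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class OverlapDemo {
    public static void main(String[] args) {
        // Pure ASCII: the byte sequences are identical in both encodings.
        System.out.println(Arrays.equals(
                "naive".getBytes(StandardCharsets.ISO_8859_1),
                "naive".getBytes(StandardCharsets.UTF_8)));  // true

        // 'ï' (U+00EF): one byte (0xEF) in ISO-8859-1, but two bytes
        // (0xC3 0xAF) in UTF-8, so the sequences differ.
        System.out.println(Arrays.equals(
                "naïve".getBytes(StandardCharsets.ISO_8859_1),
                "naïve".getBytes(StandardCharsets.UTF_8)));  // false
    }
}
```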
In terms of serving content or processing forms submissions you are unlikely to have many issues.
It may mean a breaking change for URL handling. For example, for a parameter value naïve there would be two incompatible forms:
http://example.com/foo?p=na%EFve
http://example.com/foo?p=na%C3%AFve
This is only likely to be an issue if there are external applications relying on the old form.
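The two URL forms above can be reproduced with URLEncoder (a sketch using the Charset overload available since Java 10; the parameter name p is taken from the example URLs):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlFormDemo {
    public static void main(String[] args) {
        // Old form: percent-encoding of the single ISO-8859-1 byte 0xEF for 'ï'.
        System.out.println("p=" + URLEncoder.encode("naïve", StandardCharsets.ISO_8859_1)); // p=na%EFve

        // New form: percent-encoding of the two UTF-8 bytes 0xC3 0xAF.
        System.out.println("p=" + URLEncoder.encode("naïve", StandardCharsets.UTF_8));      // p=na%C3%AFve
    }
}
```

Any external application that built or bookmarked URLs in the old form would need the Tomcat URIEncoding left as ISO-8859-1, which is why this is the main migration risk.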
I want to show Shift-JIS characters, but only when displaying: store in UTF-8 and show in Shift-JIS. What is the solution for doing that in Smarty?
You cannot mix different charsets/encodings in the output to the browser, so you can send either UTF-8 OR Shift-JIS.
You can use UTF-8 internally and, in an output filter, convert the complete output from UTF-8 to Shift-JIS (using mb_convert_encoding).
Smarty is not (really) equipped to deal internally with charsets other than ASCII supersets (like Latin1 or UTF-8).
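The output filter itself would be PHP, but the transcoding step it performs can be illustrated in Java as well (a sketch; the greeting string is only an example): keep the text in Unicode internally and convert to Shift_JIS bytes only at output time.

```java
import java.nio.charset.Charset;

public class TranscodeDemo {
    public static void main(String[] args) {
        Charset shiftJis = Charset.forName("Shift_JIS");
        String text = "こんにちは"; // stored internally as Unicode

        // Convert only at output time, as the output filter would:
        byte[] sjisBytes = text.getBytes(shiftJis);
        System.out.println(sjisBytes.length); // 10 (two bytes per hiragana character)

        // The conversion is lossless for characters Shift_JIS can represent.
        System.out.println(new String(sjisBytes, shiftJis).equals(text)); // true
    }
}
```

Characters with no Shift_JIS mapping would be replaced during conversion, so this only works cleanly if the content stays within Shift_JIS's repertoire.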
Is there a way to identify whether the browser encoding is set to/supports "UTF-8" from Javascript?
I want to send "UTF-8" or "English" letters based on the browser setting transparently (i.e. without asking the user).
Edit: Sorry, I was not very clear in the question. In a browser the encoding is normally specified as Auto-Detect, Western (Windows/ISO-8859-1), or Unicode (UTF-8). If the user has set the default to Western, then the characters I send are not readable. In this situation I want to inform the user to set the encoding to either "Auto-Detect" or "UTF-8".
First off, UTF-8 is an encoding of the Unicode character set. English is a language. I assume you mean 'ASCII' (a character set and its encoding) instead of English.
Second, ASCII and UTF-8 overlap; any ASCII character is sent as exactly the same bits when sent as UTF-8. I'm pretty sure all modern browsers support UTF-8, and those that don't will probably just treat it as latin1 or cp1252 (both of which overlap ASCII) so it'll still work.
In other words, I wouldn't worry about it.
Just make sure to properly mark your documents as UTF-8, either in the HTTP headers or the meta tags.
I assume the length of the output (that you read back after outputting it) can tell you what happened (or, without JavaScript, use the Accept-Charset HTTP header, and assume the UTF-8 encoding is supported when Unicode is accepted).
But you'd better worry about sending the correct UTF-8 headers et cetera, and fallback scenarios for accessibility, rather than worrying about the current browsers' UTF-8 capabilities.