UTF-8 malformed byte sequence error - utf-8

When I tried to validate a recent project, I got multiple errors which say "Malformed byte sequence: 92." What does this mean? It's got something to do with the charset UTF-8, but I can't find ANY information online about what this problem is or how to fix it. If anyone out there could give me a hand, that would be great!!! I only know HTML, CSS and a little JavaScript.

Okay, I figured it out. I was saving my projects as ANSI. There was a setting under the "File" tab in my code editor (Programmer's Notepad) entitled "Encoding". I changed that to UTF-8 and it cleared up the validation problems.
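The "92" in the message is most likely the hexadecimal byte 0x92. In Windows-1252 (what Windows editors usually call "ANSI") that byte is the curly apostrophe ’, but on its own it is not a valid UTF-8 sequence. Here is a minimal Python sketch of what re-saving as UTF-8 does, assuming the file really was Windows-1252; the sample bytes and function names are made up for illustration:

```python
# Hypothetical sample: the text "It’s a test" saved as Windows-1252 ("ANSI").
# 0x92 is the curly apostrophe in that encoding.
ansi_bytes = b"It\x92s a test"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def ansi_to_utf8(data: bytes) -> bytes:
    """Read the bytes as Windows-1252 text, then store that text as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


utf8_bytes = ansi_to_utf8(ansi_bytes)
print(is_valid_utf8(ansi_bytes))  # False
print(is_valid_utf8(utf8_bytes))  # True
```

After conversion the apostrophe is stored as the three bytes E2 80 99, which is what a validator expecting UTF-8 wants to see.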

Related

How to check and solve "The byte stream was erroneous according to the character encoding that was declared" in Firefox?

This issue was also raised in "How to solve byte stream erroneous", but no real solution was given there, and in particular no way to diagnose it was named.
This error is shown in Firefox, but not in Chrome. So I am not sure if the issue is only with Firefox.
Unfortunately I do not get more details on that error in the console of Firefox.
Try this page for example. I only get the issue on product pages. I have verified that the template is UTF-8 with the Linux file command, and I have saved the template with Notepad++ explicitly as UTF-8.
Since this did not work, I would like to find out what causes the issue. Does anyone have a clue how to check and determine which characters or content are causing it?
Thanks!
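One way to find the offending content is to take the raw bytes of the response (or of a template file) and ask a UTF-8 decoder where it fails. Below is a rough Python sketch; the function name and the sample data are assumptions for illustration, not taken from the page in question:

```python
def find_invalid_utf8(data: bytes):
    """Yield (offset, line_number, byte_value) for each byte where UTF-8 decoding fails."""
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            return  # the remainder is valid UTF-8
        except UnicodeDecodeError as err:
            bad = pos + err.start                 # absolute offset of the bad byte
            line = data.count(b"\n", 0, bad) + 1  # 1-based line number
            yield bad, line, data[bad]
            pos = bad + 1                         # resume right after the bad byte


# Hypothetical sample: a valid UTF-8 line followed by "Größe" saved as ISO-8859-1.
sample = "Preis: 5 €\n".encode("utf-8") + b"Gr\xf6\xdfe\n"

for offset, line, byte in find_invalid_utf8(sample):
    print(f"offset {offset}, line {line}: byte 0x{byte:02x}")
```

For a real file you would pass `open(path, "rb").read()` instead of `sample`. If only product pages are affected, the bad bytes may come from the database or an included file rather than from the template itself, so checking the full server response is more telling than checking the template alone.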

Cyrillic characters displayed wrong in the HTML source but correct in the browser

I looked for a solution to this but could not find an answer anywhere. I have a Question2Answer website in Cyrillic that displays the characters correctly in the browser. However, when I check the HTML source, the text inside the question and the answer is shown as &# numbers.
The characters on row 15 are not correctly displayed. As a result, when I try to edit a question/answer on my Android phone, the question or the answer is delivered with the encoding and it is not possible for the website users to edit their question/answer. (It still works on a computer but you can see that the characters are displayed wrong in the source file).
Please use this question on my website as a reference.
I tried to change the encoding via .htaccess:
AddDefaultCharset UTF-8
DefaultLanguage bg-BG
However, this did not work. I am curious as to how to fix this problem. Any ideas or suggestions are more than welcome.
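The &# numbers are HTML numeric character references. A browser turns them back into characters when rendering, which would explain why the page looks right while the source does not. AddDefaultCharset only changes the charset announced in the HTTP header, so it would not be expected to alter references that are already in the markup. Here is a small Python sketch of the round trip (illustrative only; it does not show how Question2Answer itself stores or escapes text):

```python
import html


def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


word = "Привет"
encoded = to_numeric_refs(word)

print(encoded)                         # &#1055;&#1088;&#1080;&#1074;&#1077;&#1090;
print(html.unescape(encoded) == word)  # True
```

If the editing form on the phone receives the escaped string instead of the decoded one, users would see the numbers rather than the letters, which matches the symptom described.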

Odd issue with funny characters in Joomla/JomSocial

I hope someone can help me with this issue.
For a few months (since last August) there has been an ongoing issue on my site with strange characters appearing all over the place - especially in user generated content.
I have searched and searched for answers but nothing ever seems to work, although the most pressing case (in the blog component) has been resolved by setting JCE to validate HTML, which it does fine in the blogging component (EasyBlog) but not anywhere else (where it is less critical but still an issue).
Here is what I have done so far:
Checked the site from multiple machines, multiple browsers - no difference.
Checked the MySQL database and table collation - which are utf8_general_ci
Added AddDefaultCharset UTF-8 and AddCharset UTF-8 .php to the .htaccess files. I played about with these for ages and these two seemed to be the only combination which didn't crash the site.
Have checked the HTML headers and they definitely have the correct content encoding types (set to UTF-8)
I have tried different WYSIWYG editors to no avail. Besides, it is often in the code output where the characters appear, typically an Â next to a »
I have tried a hack to force the connection script to UTF-8 but this causes the site to crash.
If anyone has any ideas at all as to what I can do still ... I'm all ears (please)
Many thanks in advance
If your server is running PHP 5.4+ I would suggest that you try the following solution described in the JCE forums:
In the Editor Global Configuration, set "Entity Encoding" to "UTF-8"
In the "Custom Configuration Variables" field, add:
keep_nbsp:0
Then keep an eye out for the JCE 2.3.2 release, which will address this issue.
Things to note:
Anywhere the spurious â or Â occurs will have to be edited to remove the characters (once the changes above have been applied to JCE).
The problem is Joomla! 2.5.x's use of get_html_translation_table(), which relies on default values, and PHP 5.4 changed the default encoding parameter to UTF-8. Previously it defaulted to ISO-8859-1.
For the core you could try and modify _decode() in /libraries/joomla/filter/input.php, look for the line (around 644):
$trans_tbl = get_html_translation_table(HTML_ENTITIES);
and change it to:
$trans_tbl = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'ISO-8859-1');
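The stray Â is the usual sign of UTF-8 bytes being read as ISO-8859-1 somewhere along the way. The sketch below reproduces that general mechanism in Python rather than PHP; it is an illustration of the mismatch, not Joomla's or JCE's actual code path, and the function names are made up:

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1 (the wrong charset)."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(text: str) -> str:
    """Undo the misreading, provided no bytes were lost along the way."""
    return text.encode("iso-8859-1").decode("utf-8")


garbled = misread_as_latin1("»")
print(garbled)          # Â»
print(repair(garbled))  # »

# A non-breaking space (U+00A0) is hit the same way: it becomes "Â" followed by
# a non-breaking space, which on screen looks like a lone "Â".
print(misread_as_latin1("\u00a0") == "Â\u00a0")  # True
```

Content that was already saved in garbled form stays garbled after the configuration is fixed, which is why the affected items still need to be edited by hand or repaired in bulk.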

Joomla adds an empty space in some components

I have a problem that I noticed a while ago with a site I'm building.
I've been working with Joomla for a while now and I have never encountered such a problem.
On some components, like featured content, search, hwdvideoshare and some more, quotation marks seem to be added at the very top of the main content area. This causes an extra empty space that pushes the content down.
It's not really acceptable since I am designing a layout that has to be very precise.
Hopefully you guys can help me, I have tried everything.
Open the "index.php" file in the template with Notepad++. Then, from the Encoding menu, choose "Convert to UTF8 without BOM", save and reload.
The issue was UTF8 with BOM.
However, it wasn't on the template file.
I started looking for what all the pages that showed the empty space had in common.
It was pagination. I converted the pagination file to UTF8 without BOM and it works flawlessly now.
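PHP sends any bytes that come before the opening `<?php` tag straight to the output, so a BOM in any included file can end up in the page, not just one in the template. Rather than opening files one by one, a short script can list every file that starts with the UTF-8 BOM. This is a Python sketch; the function names and the default `*.php` pattern are assumptions for illustration:

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, unchanged if there is none."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def files_with_bom(root: str, pattern: str = "*.php"):
    """Return the paths under root matching pattern whose first bytes are the UTF-8 BOM."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        with path.open("rb") as fh:
            if fh.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8:
                hits.append(path)
    return hits
```

Calling `files_with_bom` on the site's root directory (for example `files_with_bom(".")`) would list the candidates. Each one can then be rewritten with `path.write_bytes(strip_utf8_bom(path.read_bytes()))`, ideally after making a backup.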

how to debug vb6 richtextbox not showing unicode (chinese) properly

I have a simple VB6 editor-type application which has a RichTextBox as the editor page. It allows users to key in text and then store it in a file, which keeps all the text as RTF stored as CDATA in XML.
When you load the file back, it reads the XML and loads the RTF. We allow Unicode editing, but my problem is that I have a user on Windows XP who has trouble reading the Chinese characters. They show up as gibberish on their PC.
It displays fine on both my machine and a coworker's. I've already checked that they have the proper regional and language settings on their system. The install files for East Asian languages are already checked. And they can see Chinese words on websites and even type them out.
I feel like I'm missing something here, but I'm at a loss as to what to check next. Any ideas on what I could test or check?
My bad for the poor description skills; if anything is not clear, just ask me.
thanks.
~steve
That is weird. Try confirming that your user has the same version of RICHTXT32.OCX.
Could it be a problem with the font?
Try using a font that supports Unicode characters (Arial Unicode).
Or try going to a website with Chinese characters, pasting some text into the RichTextBox, saving it to a file and then loading it from the file.
Does that work?
Well, they should, because I packed the app in a VS Installer setup package.
As for fonts, it's SimSun, and I've already checked with the users that they do have the SimSun fonts under window/fonts.
By the way, I've already updated the question to say that the data is actually stored in XML under CDATA, although the RTF chunk is kept as it is.
Okay, this seems to be the solution, although I don't know why. In my MSI setup file I had included riched.dll, so when it was installed, the DLL acted up and screwed up the Chinese characters in the RichTextBox control.
But when I repacked the setup to exclude that DLL file and reinstalled using it, it seems to work now.
