UTF8 Character Malformed in CKEditor

When trying to save:
𝐼
in CKEditor with:
𝐼
it gets converted to:
𝐼
if the source is viewed. If the source is then closed out this gets converted to:
��
or as source data:
��
I tried another suggested solution, setting the character set on the library's script tag, but that hasn't resolved the issue:
<script charset="utf-8" src="/lib/CKEditor/ckeditor.js"></script>
Is there another solution to this?
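One detail that may help narrow this down: 𝐼 (U+1D43C) lies outside the Basic Multilingual Plane, so UTF-16 stores it as a surrogate pair and UTF-8 needs four bytes for it. The Python sketch below only illustrates that property. It is an assumption, not a confirmed diagnosis, that CKEditor or the storage layer handles the two surrogate halves separately, although two replacement characters would be consistent with that.

```python
# U+1D43C MATHEMATICAL ITALIC CAPITAL I, written as an escape so the
# example does not depend on the encoding of this source file.
ch = "\U0001D43C"

utf8_bytes = ch.encode("utf-8")       # four bytes in UTF-8
utf16_bytes = ch.encode("utf-16-be")  # two 16-bit units: a surrogate pair

high, low = utf16_bytes[:2], utf16_bytes[2:]

# Decoding either half on its own cannot succeed; with errors="replace"
# each half becomes U+FFFD, the replacement character.
broken = (high.decode("utf-16-be", errors="replace")
          + low.decode("utf-16-be", errors="replace"))

print(utf8_bytes.hex(), utf16_bytes.hex(), broken)
```

If that is what is happening, the places to check are any that process the text as 16-bit units or as UTF-8 limited to three bytes per character.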

Encoding problem if I put my code in module or other ps1 file

My code was working well with special chars. I could use Write-Host "é" without any issue.
Then I moved some of my functions to another PS1 file that I "dot sourced" (using Import-Module does the same), and I got encoding errors: prénom became prÃ©nom
I don't understand anything about encoding. VS Code doesn't seem to let me change the encoding of a file. It has a setting for the default encoding, but it defaults to UTF-8, and setting it to Windows-1252 changes nothing. If I use Geany to change the file's encoding to Windows-1252 it works... until I save the file again with VS Code.
Everything was working well when all my code was in the same file. Why would creating this second .ps1 file (which I created from the Windows Explorer) be a problem?
Working on Windows 10, in French, with VS Code 1.50.
Thank you in advance
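The symptom matches a file saved as UTF-8 without a byte order mark being read as Windows-1252. The Python sketch below reproduces that mismatch. It assumes Windows PowerShell 5.1, which decodes a script that has no BOM using the system ANSI code page, and a VS Code default of UTF-8 without BOM; neither is confirmed by the question.

```python
original = "prénom"

# What an editor writes when saving as UTF-8 without a BOM.
utf8_bytes = original.encode("utf-8")

# What a reader sees when it assumes Windows-1252 for those same bytes.
misread = utf8_bytes.decode("cp1252")

# Saving as "UTF-8 with BOM" prepends EF BB BF, which lets a reader
# detect the encoding instead of guessing.
with_bom = original.encode("utf-8-sig")

print(misread)
print(with_bom[:3].hex())
```

Under those assumptions, saving the dot-sourced file as UTF-8 with BOM (or consistently as Windows-1252) should make both files decode the same way.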

Custom browser protocol to open IE with params

I need to implement something similar to this answer
https://stackoverflow.com/a/41749105/1004374
but I have several issues.
I changed it slightly to be able to pass arguments in the URL:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>openie</title>
</head>
<body>
<h1>Hello world!</h1>
Google1
Google2
</body>
</html>
and changed reg script:
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\Software\Classes\openie]
"URL Protocol"="\"\""
@="\"URL:OPENIE Protocol\""
[HKEY_CURRENT_USER\Software\Classes\openie\DefaultIcon]
@="\"explorer.exe,1\""
[HKEY_CURRENT_USER\Software\Classes\openie\shell]
[HKEY_CURRENT_USER\Software\Classes\openie\shell\open]
[HKEY_CURRENT_USER\Software\Classes\openie\shell\open\command]
@="cmd /k set myvar= & call set myvar=\"%1\" & call set myvar=%%myvar:openie:=%% & call \"C:\\Program Files (x86)\\Internet Explorer\\iexplore.exe\" %%myvar%% & exit /B"
The only change is quoting the %1 argument:
myvar=\"%1\"
This is needed to pass arguments containing &. Otherwise the URL is copied only up to the first ampersand:
openie:https://www.google.com/?word=abc&word2=abc2
Everything works when you click the link for the first time. When IE is already open, the URL is passed incorrectly, with encoded quotes inside it and http:// automatically added at the beginning:
http://%22https//www.google.com/?word=abc&word2=abc2"
I realize that the issue is in the cmd script, but I cannot work out what should be changed so that arguments are passed correctly and links can be clicked many times.
I have not found a good way to modify the script to accept the '&', but as a workaround you could encode the URL, changing each '&' to '%26', so the link looks like this:
Google2
Then, in the destination page, you can decode the URL, changing '%26' back to '&', and split the string to get the parameters.
For more details, refer to the HTML URL Encoding reference.
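The encode and decode steps of this workaround can be sketched in Python as follows. The URL is the one from the question; the variable names are illustrative only, and the destination page would do the decoding in whatever language it is written in.

```python
from urllib.parse import parse_qs, unquote, urlsplit

target = "https://www.google.com/?word=abc&word2=abc2"

# Link side: hide each '&' from cmd.exe by percent-encoding it.
link = "openie:" + target.replace("&", "%26")

# Destination side: drop the custom scheme, decode '%26' back to '&',
# then split the query string into parameters.
received = link[len("openie:"):]
decoded = unquote(received)
params = parse_qs(urlsplit(decoded).query)

print(link)
print(params)
```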

Pandoc: [WARNING] Could not convert TeX math

I tried to convert HTML to DOCX using Pandoc.
Here is my HTML:
<p> Example: ${v_1} = {\rm{ }}{v_2}$</p>
with MathJax config in head:
MathJax.Hub.Config({
  extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
  jax: ["input/TeX", "output/HTML-CSS"],
  tex2jax: {
    // Backslashes must be doubled inside JavaScript string literals
    inlineMath: [['$', '$'], ["\\(", "\\)"]],
    displayMath: [['$$', '$$'], ["\\[", "\\]"]]
  },
  "HTML-CSS": {availableFonts: ["TeX"]}
});
The Pandoc command that I used (Pandoc version 2.2.3.2):
pandoc -s test.html --mathjax -f html+tex_math_dollars --pdf-engine=xelatex -o xxx.docx
Then I got a warning:
[WARNING] Could not convert TeX math '{v_1} = {\rm{ }}{v_2}', rendering as TeX:
{v_1} = {\rm{ }}{v_2}
^
unexpected "{"
expecting "%", "\\label", "\\nonumber" or whitespace
Someone please tell me how to fix this. Thanks!
Use the LaTeX command \textrm instead of the plain TeX \rm, and pandoc will be able to handle it.
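Applied to the formula from the question, the change would look roughly like this. It is a sketch of the substitution only and was not run against Pandoc 2.2.3.2; since the \rm group contains nothing but a space, dropping it entirely is presumably another option.

```latex
% Before: plain TeX font switch, rejected by pandoc's math parser
${v_1} = {\rm{ }}{v_2}$

% After: LaTeX text command taking its argument in braces
${v_1} = \textrm{ }{v_2}$
```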
Since 7k users have viewed this question since it was asked... perhaps others have made the same mistake I made as a novice RStudio user.
The first comment in both the README.md and the README.Rmd file is
<!-- README.md is generated from README.Rmd. Please edit that file -->
The intended meaning is (at least arguably) apparent if you pay sufficient attention to the this/that relative pronouns!
<!-- You should edit the README.Rmd file, not the README.md file -->
To repair the damage, I'm currently trying the suggestion to use an explicit devtools::build_readme(), which I found in "RStudio README.Rmd and README.md should be both staged use 'git commit --no-verify' to override this check".
No luck yet... but I feel like I'm (finally!) making forward progress on getting $\sqrt{x}$ to display properly in my GitHub README!

Umlauts in filenames are truncated (are shown as question marks)

On one of our ColdFusion 10 enterprise / CentOS 6.5 servers umlauts in filenames are saved as ?.
For example:
<CFPROCESSINGDIRECTIVE pageencoding="UTF-8">
<CFSET VARIABLES.umlauts = "ümläüté" />
<CFSET VARIABLES.filename = createUUID() & "-" & VARIABLES.umlauts & ".txt" />
<CFFILE action="write" output="#VARIABLES.umlauts#" file="#expandpath("./" & VARIABLES.filename)#" />
<CFOUTPUT>#VARIABLES.filename#</CFOUTPUT> <!--- outputs something like: A9C9BC8C-983A-5EA6-A4ED411BA0E63C72-ümläüté.txt --->
writes a file called A8B49720-020A-2500-605F4CC73129D07C-?ml??t?.txt to disk. The content of the file is "ümläüté", as expected.
Manually creating files with umlauts in the filename is no problem (e.g. touch äöüß.txt works as expected).
More details of server:
Java Version: 1.6.0_29
Tomcat Version: 7.0.23.0
Java File Encoding: UTF8
$ cat /etc/sysconfig/i18n
LANG="en_US.UTF-8"
$ locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
Any ideas what could cause this behaviour?
I'll post this as an answer for better visibility.
A user of OpenBD (Open BlueDragon, an alternative CFML engine) was having exactly the same issue:
If I try to upload a file with, for example, the filename "testätest.pdf", then I have the following situation:
The file, OpenBD stores to my filesystem, is named: test?test.pdf
The filename, reported via #cffile.ServerFile# is: testätest.pdf
He later came back with this answer
It seems like this has been resolved by setting "LC_ALL=en_US.UTF-8". It seems to be a tomcat problem that it sets question marks for special characters if the charset is unknown.
Or, in the OP's case, perhaps set LC_ALL to "de_DE.UTF-8".
Source: Issue 516: Special characters (like german "Umlauts") in filenames of uploaded files are replaced with "?"
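The "?ml??t?" pattern in the question is what an encoder produces when it substitutes "?" for every character its target charset cannot represent. The Python sketch below reproduces the pattern with ASCII as the target. It is only an analogy for what the JVM under Tomcat appears to do when no usable locale charset is set, not a trace of the actual Java code path.

```python
name = "ümläüté"

# Encoding to a charset that lacks these characters, substituting '?'
# for each one, mirrors the filename seen on disk.
ascii_name = name.encode("ascii", errors="replace").decode("ascii")

# With UTF-8 as the charset, every character survives the round trip.
utf8_name = name.encode("utf-8").decode("utf-8")

print(ascii_name)
print(utf8_name)
```

This is consistent with the file content being written correctly (it is encoded by ColdFusion itself) while the filename, which goes through the platform's locale-dependent encoding, is not.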

UTF-8 i18n file

I'm trying to add a Chinese localisation to a scaffolded Yesod site. I have a zh.msg message file saved as UTF-8 format using Notepad in Windows, but when I run cabal install in the project directory, I get this:
Handler\Home.hs:15:11:
Not in scope: data constructor `MsgHello'
Perhaps you meant `Msg<stderr>: hPutChar: invalid argument (invalid character)
The line in question is where I render my homepage:
$(widgetFile "homepage")
I changed both message files to be saved as "Unicode" (Notepad's name for UTF-16 LE) instead of UTF-8, and got this message instead:
Foundation.hs:1:1:
Exception when trying to run compile-time code:
Cannot decode byte '\xff': Data.Text.Encoding.Fusion.streamUtf8: invalid UTF-8 stream
So I guess UTF-8 is the way to go... somehow.
(I'm using Notepad because I haven't set up gVim to render Unicode characters. It's apparently a bit of a feat.)
When I went to commit my changes I discovered the issue. The diff for my English file looked like this:
-Hello: Hello
+<U+FEFF>Hello: Hello
I guess Notepad added the character (U+FEFF, a byte order mark) at the start of the file, and it was working its way into the Haskell code. I solved it using vim according to this answer.
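For anyone without vim set up, the same cleanup can be done with a few lines of Python. This is a minimal sketch, not the method from the linked answer: the helper name strip_utf8_bom is made up for illustration, and the demo runs on a temporary file rather than a real .msg file.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at path.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True


# Demo on a throwaway file that mimics what Notepad wrote.
fd, demo_path = tempfile.mkstemp(suffix=".msg")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + "Hello: Hello\n".encode("utf-8"))

removed = strip_utf8_bom(demo_path)
with open(demo_path, "rb") as f:
    cleaned = f.read()
os.remove(demo_path)

print(removed, cleaned)
```

The file is handled as raw bytes so that nothing other than the three BOM bytes is changed.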
