TypeScript compiler doesn't accept Cyrillic - compilation

I'm using JetBrains WebStorm and I have the following problem.
If I type this in a .ts file, for example:
var test: string = 'тест';
it becomes:
var test = 'пїЅпїЅпїЅпїЅ';
Is there a way to compile TypeScript files without losing the non-Latin strings?

The TypeScript compiler accepts Cyrillic just fine. You should set up your editor to use UTF-8 encoding for your files:
https://blog.jetbrains.com/idea/2013/03/use-the-utf-8-luke-file-encodings-in-intellij-idea/
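To see why the encoding setting matters, here is a minimal Python sketch that reproduces the garbled string from the question. It assumes the .ts file was saved as Windows-1251 and then read as UTF-8 by the toolchain; that starting encoding is a guess based on the output, not something the question confirms.

```python
# Hypothetical reconstruction; the Windows-1251 starting point is an assumption.
original = "тест"

# 1. The editor saves the file as Windows-1251 (one byte per Cyrillic letter).
saved_bytes = original.encode("cp1251")

# 2. A tool reads those bytes as UTF-8; each invalid byte becomes U+FFFD.
misread = saved_bytes.decode("utf-8", errors="replace")

# 3. The output is written as UTF-8 and then displayed as Windows-1251 again.
displayed = misread.encode("utf-8").decode("cp1251")

# Saving as UTF-8 from the start keeps the text intact.
round_trip = original.encode("utf-8").decode("utf-8")

print(displayed)
print(round_trip)
```

Under that assumption the information is already lost at step 2, which is why the fix belongs in the editor's encoding setting rather than in the compiler.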

Related

Visual Studio 2019 does not properly convert UTF-8 strings to UTF-16 strings in source files without BOM

I have the following source file (encoded in UTF-8 without BOM, displayed fine in the Source Code Editor):
#include <Windows.h>

int main()
{
    MessageBoxW(0, L"Umlaute ÄÖÜ, 🙂", nullptr, 0);
    return 0;
}
When running the program, the special characters (Umlaute and Emoji) are messed up in the Message Box.
However, if I save the source file manually as "UTF-8 with BOM", Visual Studio will properly convert the string to UTF-16 and when running the program, the special characters are displayed in the Message Box. But it would be annoying to convert every single file to UTF-8 with BOM. (Also, I think GCC for example does not like BOM?)
Why is Visual Studio messing up my string, if there is no BOM in the source file? The Auto-detect UTF-8 encoding without signature option is already enabled.
I tested the same source with MinGW-w64 and don't have the issue, regardless of whether there is a BOM or not.
Use the /utf-8 compiler switch. The MS compiler assumes a legacy ANSI encoding (Windows-1252 on US and Western European versions of Windows) if no BOM is found in the source file.
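The rule in that answer can be modelled in a few lines of Python. This is a simplified sketch of the decision (BOM present: UTF-8, otherwise a legacy code page), not the compiler's actual implementation; the function name decode_source and its legacy parameter are made up for illustration.

```python
import codecs


def decode_source(data: bytes, legacy: str = "cp1252") -> str:
    """Decode source bytes following the simplified rule from the answer."""
    if data.startswith(codecs.BOM_UTF8):
        # A BOM marks the file as UTF-8; drop it and decode the rest.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # No BOM: fall back to the legacy code page.
    return data.decode(legacy)


text = "Umlaute ÄÖÜ"
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
without_bom = text.encode("utf-8")

print(decode_source(with_bom))                     # decoded correctly
print(decode_source(without_bom))                  # umlauts come out garbled
print(decode_source(without_bom, legacy="utf-8"))  # roughly the effect of /utf-8
```

The third call only approximates /utf-8 by treating BOM-less bytes as UTF-8 in this toy model.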

Transcoding with Gradle

My build.gradle file generates UNIX and Windows launcher scripts for my Java project from templates.
Templates are UTF-8 encoded and the generated scripts are UTF-8 too. It's not a problem on Linux, where UTF-8 support is ubiquitous, but Windows has some issues displaying non-Latin-1 characters in the cmd.exe terminal window. After reading Using UTF-8 Encoding (CHCP 65001) in Command Prompt / Windows Powershell (Windows 10), I came to the conclusion that converting the generated UTF-8 script to cp1250 (in my case) would save me lots of trouble when displaying Hungarian text. However, I couldn't figure out how to convert a UTF-8 file to another code page (I looked at copy, but didn't find a way to specify the output encoding).
Simply use FileUtils from Apache Commons IO in your build file.
import org.apache.commons.io.FileUtils
buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath("commons-io:commons-io:2.8.0")
    }
}
And then, in the relevant part of the script, where launcher scripts are generated:
File f = file('/path/to/windows-launcher')
// Reading the content as UTF-8
String content = FileUtils.readFileToString(f, 'UTF-8')
// Rewriting the file as cp1250
FileUtils.write(f, content, "cp1250")
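For comparison, the same read-as-UTF-8, write-as-cp1250 step can be sketched outside Gradle in Python. The transcode helper and the launcher.bat file name are hypothetical, and the sample text is just a Hungarian pangram chosen to exercise accented letters.

```python
import tempfile
from pathlib import Path


def transcode(path: Path, source: str = "utf-8", target: str = "cp1250") -> None:
    """Rewrite the file in place, converting it from `source` to `target`."""
    content = path.read_bytes().decode(source)
    # encode() raises UnicodeEncodeError if a character has no cp1250 equivalent.
    path.write_bytes(content.encode(target))


sample = "echo Árvíztűrő tükörfúrógép\r\n"

with tempfile.TemporaryDirectory() as tmp:
    launcher = Path(tmp) / "launcher.bat"  # hypothetical file name
    launcher.write_bytes(sample.encode("utf-8"))
    transcode(launcher)
    result = launcher.read_bytes()

print(result.decode("cp1250"))
```

Working on bytes avoids any newline translation, so the CRLF line endings of the Windows script survive the conversion.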

Encoding problem if I put my code in module or other ps1 file

My code was working well with special chars. I could use Write-Host "é" without any issue.
And then I moved some of my functions to another PS1 file that I dot-sourced (using Import-Module does the same), and I got encoding errors: prénom became prénom
I don't understand anything about encoding. VS Code doesn't let me change the encoding of a file. It has a setting for the default encoding, but it defaults to UTF-8, and when I set it to Windows-1252 nothing changes. If I use Geany to change the encoding to Windows-1252 it works... until I save the file again with VS Code.
Everything was working well when all my code was in the same file. Why would creating this second .ps1 file (which I created from the Windows Explorer) be a problem?
Working on Windows 10, in French, with VS Code 1.50.
Thank you in advance

How to automate search and replace for i18n.properties files with WebStorm

For SAPUI5 there are i18n.properties files.
For the German language I need to replace the special German characters with their Unicode escape codes.
# AE = \u00C4, ae = \u00E4
# OE = \u00D6, oe = \u00F6
# UE = \u00DC, ue = \u00FC
# SZ = \u00DF
How can I automate this search and replace with WebStorm?
You could just use WebStorm's 'Replace in Path' (CMD+SHIFT+R on Mac) on your i18n folder. IntelliJ IDEA has better editing support for .properties files though (since they come from Java).
It would also be easy to do this via a Node script, a bash script, a gulp task, or the like.
Btw: is this really needed? Having all .properties files in UTF-8 should just do the trick. AFAIK only Tomcat got confused by that, since the Java spec defines these files as ISO-8859-1. As long as you are deploying to a platform that accepts them as UTF-8, there shouldn't be an issue.
BR
Chris
PS: That code looks really familiar ;D
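As a rough illustration of the script route mentioned in the answer, here is a minimal Python sketch that replaces every non-ASCII character with its Unicode escape. The function name escape_non_ascii and the sample property line are made up; characters outside the Basic Multilingual Plane are written as UTF-16 surrogate pairs, which is the form Java .properties files use.

```python
def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with backslash-u escapes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        # Go through UTF-16 so that non-BMP characters become surrogate pairs.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


line = "greeting=Grüße aus Köln"  # hypothetical property line
print(escape_non_ascii(line))
```

Applied to a whole file, this would be run line by line over text read as UTF-8; it leaves ASCII content, including existing escapes, untouched.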

Disable encoding checking in java gradle project

I want to migrate one of our Java projects from Ant to Gradle. This project has a lot of source code written by several programmers. The problem is that some of the files are encoded in ANSI and some in UTF-8 (this generates compile errors). I know that I can set the encoding using compileJava.options.encoding = 'UTF-8', but this will not work (not all files are encoded in UTF-8). Is it possible to disable encoding checking (I don't want to change the encoding of all files)?
This is not an issue with Gradle but with javac. However, you can solve this issue by running a one-time Groovy script in your Gradle build as described below.
Normally you'd only need to add the following line to your build.gradle file:
compileJava.options.encoding = 'UTF-8'
However, when saving files as UTF-8, some text editors write a byte order mark (BOM) at the beginning of the file.
javac does not understand the BOM, not even when you compile with the -encoding UTF-8 option, so you're probably getting an error such as this:
> javac -encoding UTF8 Test.java
Test.java:1: error: illegal character: \65279
?class Test {
You need to strip the BOM from your source files or convert your source file to another encoding. Notepad++ for example can convert the file encoding from one to another.
For lots of source files you can easily write a simple task in Groovy/Gradle that opens each source file and rewrites it, removing the BOM prefix from the first line if one is found.
Add this to your build.gradle and run gradle convertSource
task convertSource {
    // doLast replaces the '<<' shorthand, which was removed in Gradle 5.0
    doLast {
        // convert source files in the main source set to normalized text format
        sourceSets.main.java.each { file ->
            // read first "raw" line via BufferedReader
            def r = new BufferedReader(new FileReader(file))
            String s = r.readLine()
            r.close()
            // get entire file normalized
            String text = file.text
            // get first "normalized" line
            String normalizedLine = new StringReader(text).readLine()
            if (s != normalizedLine) {
                println "rename: $file"
                File target = new File(file.getParentFile(), file.getName() + '.bak')
                if (target.exists()) {
                    println "skipped: backup $target already exists"
                } else if (file.renameTo(target)) {
                    // write explicitly as UTF-8 to match compileJava.options.encoding
                    file.setText(text, 'UTF-8')
                } else {
                    println "failed to rename $file"
                }
            }
        }
    }
} // end task
The convertSource task simply enumerates all of the source files, reads the first "raw" line from each one, then reads the normalized text and compares the two first lines. If the first lines differ, it renames the original to a .bak backup and writes a new file with the normalized text in its place. You only need to run the convertSource task once, after which you can remove the .bak backups of the original source files, and the compile should work without encoding errors.
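If you would rather do the cleanup outside the build, the BOM removal itself is small. Below is a minimal Python sketch, assuming the only problem is a leading UTF-8 BOM (the \65279 in the javac error is U+FEFF); the strip_bom helper is hypothetical and, unlike the Gradle task, it keeps no backup.

```python
import codecs
import tempfile
from pathlib import Path


def strip_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM in place; return True if the file changed."""
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "Test.java"
    src.write_bytes(codecs.BOM_UTF8 + b"class Test {}\n")
    # Walk the tree and collect the names of files that had a BOM.
    changed = [p.name for p in Path(tmp).rglob("*.java") if strip_bom(p)]
    cleaned = src.read_bytes()

print(changed, cleaned)
```

This does nothing for files that are genuinely in a different encoding such as ANSI; those would still need converting to UTF-8 separately.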

Resources