Is there any diffrence between edit an utf-8 ini file with notepad or VS2010? - utf-8

I have an application which needs to read an ini file as a predefine config. There are some Chinese characters in the ini file, so I chose utf-8 encoding. But when I edit the file with Windows notepad and change some of the Chinese characters, the application will flash close when I run it. But when I use VS2010 to change the same Chinese characters everything is OK.
Is there some difference between editing a utf-8 ini file with notepad and VS2010? I'd prefer to be able to use notepad to modify the ini file as it is the most convenient option and most users will already have access to it.
This is a reviewed version of previous question.

Related

In sqlplus how to change or specific file encoding to utf8

in sql plus I created a file :
>EDIT test.sql
the file encoding is ansi. I want to change it to ut8, I tried by save-as but it didnt work
how to change test.sql from ansi to utf8 in sql plus ?
If you want to change your default editor for sqlplus, you can do this:
logon to sqlplus
use "define" command to set the path to the new editor. I like TextPad (which handles utf8 automatically), so the command (for my machine anyway) would be:
define _editor="c:\Program Files\TextPad 7\TextPad.exe";
Now, the edit command will bring up the new editor.
For TextPad, you can change the encoding by doing the following:
Conversion: Conversion between various file formats and encodings can
be made using the Save As command on the File menu. The options for
encoding are ANSI, DOS, Unicode, Unicode (big endian) and UTF-8.
So after doing the steps above, edit will bring up Textpad, and using Save As you can save the file using various encodings.
This is one benefit of changing the default editor from Notepad.

Notepad++, Atom, encoding seems to be broken

I have to edit php (.inc) file which was created a long ago and I don't know which editor was used to create it. The Cyrillic letters in Notepad++ are shown like they were in wrong encoding:
In GitHub's Atom editor, Cyrillic letters are totally lost and replaced with the � character:
But in browser everything is displayed correctly! The same is true when using Windows Notepad. Why it is displayed incorrectly in code editors and is there a way to make it look normal?
P.S. OK the thought that I just can copy it from windows notepad and save in notepad++ only now came to me :D But still curious why this happened to code editors.
P.S.2 Problem is solved. Editors just didn't recognize the original encoding properly. When I changed it manually to Windows1251, everything became ok.
Atom's support for encoding isn't as mature as some other editors out there, as you have already discovered you can change the encoding in the bottom right hand corner and Atom will remember it, however there are some packages which help further:
Out of the box as you have discovered Encoding Selector which allows you to choose how Atom interprets the contents of the text file.
There is a package that automatically select encoding for you named Auto Encoding, however it does have some issues with certain types of file, you might find this isn't a problem.
Finally, there is my personal favorite, editor-settings, which allows you to set the encoding of all files of a specific language, with a specific file extension or or directory.
As an example if you wanted to configure all .inc files in a directory to use windows-1251 create a .editor-settings in the directory you are using and paste in the following:
encoding: utf-8
extensionConfig:
inc:
encoding: windows-1251

UTF-8 Tab Separated File not opening in Excel properly

I am generating tab separated .xls file from SAP MII using XSLT which has few french characters as headers/columns.
When I open this file in excel, all special characters are messed up and does not appear correctly.
Such as Description évènement
When I store this file as .txt and open it in notepad, everything is displayed correctly.
It seems that Excel is not opening this file in UTF-8 formay but opens by default in ASCII format.
How do I avoid that?
Soham
You have to open excel then import the CSV off the data tab. UTF-8 will be recognized during the import. That said, I do need the ability to do this by simply double clicking on the CSVs.

Why do Netbeans, Aptana Studio and Komodo Edit all not save in UTF-8?

I'm getting back into development and want to find a good editor for HTML5/JQuery.
Being able to save files in UTF-8 is important.
However, although I set my project in NetBeans 7.0 to encode in UTF-8, when I create a file in the project, then look at it in Notepad++, the file is encoded in ANSI and I have to manually set the encoding to UTF-8:
In Aptana Studio 3 I set the workspace to UTF-8 encoding, and my project inherits from that, but when I create a file in the project and look at it in Notepad++, it is encoded in ANSI and I have to change the encoding manually to UTF-8:
So I tried Komodo Edit 7 and in the file manually set the encoding to UTF-8, saved the file, looked at it in Notepad++ which said the file is in ANSI.
I notice in any of these editors if I put a German umlaut character in the file, then Notepad++ shows it as "ANSI as UTF-8" but I still have to manually change it to UTF-8 in Notepad++ where it will stay.
The reason I want an editor that saves in UTF-8 is I remember having a project a couple years ago which had German and French characters in the files and after they were viewed and saved in various editors, the characters would be replaced with garbage characters. The solution was to always initially set the encoding of the file to UTF-8.
I assumed that editors would be so far advanced now that if you specify that the files should be saved in UTF-8, that they actually save in UTF-8 in a way that is recognized by every modern text editor. Is this not the case? What am I not understanding about modern text editors and development environments in regard to UTF-8?
How can I get these editors to save their files in UTF-8 encoding?
A UTF-8 encoded file that only contains characters also present in the ASCII table (the first 128 Unicode characters, i.e. your basic alphanumeric characters) is indistinguishable from an ASCII/ANSI encoded file. My guess is that Notepad++ simply can't make the distinction (because there is none) and defaults to ANSI. You can see the difference when you include a character that is not in the ASCII table. By "ANSI as UTF-8" I can only guess that it means "this documents contains characters from the ANSI table (a.k.a. Latin-1) and is saved in UTF-8".
In other words, your IDEs are probably fine, the problem is with Notepad++.
Try a character like 漢字, that will result in a pretty unique UTF-8 byte sequence that's most certainly not ANSI.
From what I've seen on this topic, Notepad's UTF-8 equates to Notepad++'s UTF-8, which means with BOM included. If a file is saved with this encoding and opened in NetBeans, it will actually show a - character or the  characters for the BOM sequence (depending on whether the encoding for the project or IDE is set to UTF-8.) But if you save the file in Notepad++ encoded as "UTF-8 without BOM", and have either your project defined as UTF-8 or have your netbeans_default_options included with this -J-Dfile.encoding=UTF-8, you'll see what I think is UTF-8 as it should be. Unfortunately, if you try to edit this file in NetBeans without including characters that are outside of the ANSI code set, you see the behavior that you referred to in your question with the file having its encoding set to ANSI.
So in an attempt to make this a "sort-of" answer to your question, please remember that not all editor's concept of UTF-8 are the same. Notepad++ gives the most actual info on what the real encoding for a file is. I'd say that developing in either a Linux or Mac environment might be a possible good choice for making sure that localization is correct, but on Windows a decent workaround might be to just include a non-ANSI character in the file to insure it always get saved as a UTF-8 (non-BOM) file. This is all geared towards NetBeans dev by the way. I haven't tested this with the others, though I'm willing to bet that they will save the file correctly on a Windows machine if they have non-ANSI characters in them. Sorry for the kluge gang, but either way, I hope it helps someone struggling with this same issue.

TextPad and Unicode: full support?

I've got some UTF-8 files created in Mac, and when trying to open them using TextPad in Windows, I get the following warning:
WARNING: (file name) contains characters that do not exist in code
page 1252 (ANSI Latin 1). They will be converted to the system default
character, if you click OK.
Linux (GNOME gEdit) can open the same file without complaints. What does the above mean? I thought that TextPad had full UTF-8 support. Can I safely open and edit UTF-8 files using it without corrupting the file?
It seems that TextPad cannot handle characters outside windows-1252 (CP1252, here carrying the misnomer “ANSI Latin 1”). I tested it on Windows, opening a plain text file created on the same system, as UTF-8 encoded, both with and without BOM, with the same result. The program’s help does not seem to contain anything related to character encodings, and its tools for writing “international characters” are for Latin-1 characters only.
There are several text editors for Windows that can deal with UTF-8 (even Notepad can open a UTF-8 file, but it can hardly be recommended for serious editing). See Alan Wood’s collection of information on Unicode editors and word processors for Windows. (Personally, I like Notepad++ and BabelPad, which are both free.)
TextPad 8, the newest as of 2016-01-28, does finally properly support BMP Unicode. It's a paid upgrade, but so far has been working flawlessly for me.
TextPad ‘supports’ UTF-8 and UTF-16 documents only in as much as it will import and export them. But it still edits files as simple bytes, and not Unicode characters (using the ANSI code page, which is code page 1252 for Western European).
So unless the file happened to contain only characters that also exist in that code page, you will lose content. This rather defeats the point of Unicode.
Indeed, this was the issue that made me flee—to EmEditor, at the time, though now I would agree with the previous comments and recommend Notepad++. The era of paying for text editors is long gone.
Actually TextPad does support displaying Unicode code points granted they went about it the wrong way. In order to display the Unicode characters you have to choose Configure->Preferences and expand "Document Classes->Text->Font.
You need to choose a Unicode font AND set the Script to match. E.g. Arial Unicode MS with script CHINESE_BIG5.
However, this is a backward approach since the application should handle this when the user tells TextPad to open the file in Unicode or UTF-8. The built in Notepad application with MS Windows will detect the encoding automatically and display the glyphs correctly based upon the encoding.
I found a discussion on this in the Textpad forums:
http://forums.textpad.com/viewtopic.php?t=11019
While I have Notepad++, Textpad handles large files with ease while other editors I've tried, including Notepad++, either slow to a crawl or die. I'm currently trying to edit a 475MB file and Notepad++ is not up to the task.
Textpad Configure Menu --> Preferences --> Document Classes --> Default --> Default encoding --> UTF-8
Try the ANSI code set with File/Open, that should solve the problem in TextPad

Resources