Every time I save a file that contains Unicode characters in
Notepad, it warns me that the file is going to be saved in ANSI
format and that I will lose some data, and that I should cancel saving
and choose UTF-8 as the encoding instead.
How can I set the default encoding to UTF-8 so that it does not prompt me every time?
Thanks in advance.
In Windows 10, go to
Control Panel > Region > Administrative tab.
Click the "Change system locale..." button.
Then choose the language you use from the combo box labeled "Current system locale",
and check the checkbox labeled "Beta: Use Unicode UTF-8 for worldwide language support".
Click OK.
Short answer: Notepad simply does not support what you are asking for. It always defaults to ANSI, so you have to tell it explicitly not to use ANSI. However, there are alternatives available; see "Changing the default ANSI to UTF-8 in Notepad" on Super User.
I used Vim to open a file, event.txt, and show me some search results. This worked fine, but since I made a change in my _vimrc, it displays the file content with #-signs so that it is unreadable. See image below.
What I did change in _vimrc was
set fileencoding=utf-8
but I commented it out, so it should not affect Vim:
"set fileencoding=utf-8
The file is still displayed unreadably. With other editors I can open the file and view it normally. I had this behaviour some time ago, but it vanished somehow; I can't remember how.
The event.txt file is the Windows event log, which I generate with PowerShell:
get-eventlog -logname system > event.txt
Something tells me it's not the change in the _vimrc but something else. However, this is the last change I remember making, and after it things stopped working.
How can I view the Windows event file event.txt normally in Vim, without the #-signs?
That ÿþ at the beginning is a byte order mark (BOM), typical for Windows Unicode text. The ^# is Vim's representation of a NUL value, and it (roughly) appears as every second character. So, you have a (mostly) ASCII-text file, encoded in UCS-2 little endian: each character is represented by two bytes (16 bit), the lower one comes first.
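To make the byte layout described above concrete, here is a minimal Python sketch. The sample string is a made-up stand-in for the real event log contents; the point is only to show where the "ÿþ" and the NUL bytes come from when ASCII text is stored as UTF-16 little endian with a BOM.

```python
import codecs

# Hypothetical sample line; the real event.txt content will differ.
text = "Index Time"

# BOM followed by UTF-16 little-endian code units.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Interpreted as Latin-1, the two BOM bytes FF FE show up as "ÿþ".
bom_as_latin1 = data[:2].decode("latin-1")

# After the BOM, every second byte is NUL for ASCII-range characters,
# which is what an editor shows when it reads the file as an 8-bit encoding.
high_bytes = data[3::2]

# Decoding with the right encoding recovers the text; "utf-16" consumes the BOM.
decoded = data.decode("utf-16")

print(repr(bom_as_latin1), high_bytes, repr(decoded))
```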
You can open that file with
:edit ++enc=ucs2-le event.txt
But it's better to set up Vim correctly so that it detects the encoding automatically. Since you're using GVIM on Windows, I would recommend putting
:set encoding=utf-8
at the start of your ~/.vimrc. This will automatically set your 'fileencodings' to a good default of ucs-bom,utf-8,default,latin1. Note the first element; that should help detect the file.
Do not set 'fileencoding' in your ~/.vimrc! That is a buffer-local setting, and it will be automatically set by Vim on opening of the file. The 'fileencodings' (note the plural) is the right option to influence the detection.
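The idea behind an ordered list such as ucs-bom,utf-8,default,latin1 can be sketched in Python. This is only an illustration of "try candidates in order, first success wins", not Vim's actual implementation; the function name detect_and_decode and its fallback parameter are hypothetical, and UTF-32 BOMs are deliberately ignored to keep it short.

```python
import codecs

def detect_and_decode(data: bytes, fallback: str = "latin-1"):
    """Return (encoding_name, text), trying candidates in a fixed order."""
    # Step 1: look for a byte order mark first (analogous to "ucs-bom").
    # Note: UTF-32 BOMs are not handled in this sketch.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig", data.decode("utf-8-sig")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to choose the byte order.
        return "utf-16", data.decode("utf-16")

    # Step 2: strict UTF-8; invalid byte sequences raise an error.
    try:
        return "utf-8", data.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Step 3: an 8-bit fallback that accepts any byte sequence.
    return fallback, data.decode(fallback)

print(detect_and_decode(codecs.BOM_UTF16_LE + "abc".encode("utf-16-le")))
print(detect_and_decode(b"\xe9"))
```

Because the last candidate accepts every byte sequence, it has to come last; anything placed after it would never be tried.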
When going to Settings > Preferences > New Document, which language should I choose to create Unix scripts (*.sh, *.bsh)?
I know that one solution is to choose the format when saving, but it's kind of annoying...
Thanks!
Format: Unix/OSX
Default language: Shell
Encoding: UTF-8 without BOM
At the bottom of Notepad++, in the status bar, the 7th and 8th columns describe the format of the file you are editing.
Double-click the 7th one and select "Unix (LF)".
For the 8th one, go to Encoding -> Encode in UTF-8.
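For files that were already saved with Windows line endings or a BOM, the same target format (LF line endings, UTF-8 without BOM) can be produced outside the editor. The following is a minimal Python sketch; the function name to_unix_utf8 is hypothetical, and it assumes the input is UTF-8 with or without a BOM unless another source encoding is passed in.

```python
import codecs

def to_unix_utf8(data: bytes, source_encoding: str = "utf-8-sig") -> bytes:
    """Convert file bytes to LF line endings and UTF-8 without a BOM."""
    # "utf-8-sig" drops a leading BOM if one is present.
    text = data.decode(source_encoding)
    # Normalize CRLF first, then any remaining lone CR.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # The plain "utf-8" codec never writes a BOM.
    return text.encode("utf-8")

sample = codecs.BOM_UTF8 + b"#!/bin/sh\r\necho hi\r\n"
print(to_unix_utf8(sample))
```

This matters for shell scripts in particular, since a BOM in front of the "#!" line or a stray carriage return at the end of it can stop the interpreter line from being recognized.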
I have a Unicode string in Arabic that I want to display in the Output window rather than in the console, so I can only use OutputDebugStringW. I call SetConsoleOutputCP(1256) to set the Arabic code page, but it still only outputs "????". What should I do?
This is a documented restriction for OutputDebugStringW():
OutputDebugStringW converts the specified string based on the current system locale information and passes it to OutputDebugStringA to be displayed. As a result, some Unicode characters may not be displayed correctly.
Calling SetConsoleOutputCP() doesn't solve the problem; it changes the code page for the console window, not the debugger. You'd have to change your system locale: Control Panel + Region, Administrative tab. If Arabic is your preferred language, then changing the system locale to one that uses code page 1256 is the appropriate thing to do. It will of course have system-wide effects.
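The effect of that Unicode-to-ANSI conversion can be imitated in Python. This is a sketch under assumptions: cp1252 stands in for a non-Arabic system code page, and Python's errors="replace" only approximates what the Windows conversion does with unrepresentable characters. It does not call OutputDebugStringW itself.

```python
# Sample Arabic word written with escapes (meem, reh, hah, beh, alef).
text = "\u0645\u0631\u062d\u0628\u0627"

# Assumed Western system code page: none of the letters exist in it,
# so each one is replaced by "?".
as_1252 = text.encode("cp1252", errors="replace")

# Arabic code page 1256 can represent every letter, so nothing is lost.
as_1256 = text.encode("cp1256", errors="replace")
round_trip = as_1256.decode("cp1256")

print(as_1252)
print(round_trip == text)
```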
In the Open with menu of a .cs file there's Csharp editor and Csharp editor with encoding. I opened a solution with both and didn't see a difference.
What's the difference between them?
Unless your .cs file includes characters outside of the normal ASCII range, you won't see a difference in the actual contents of the file. The difference is whether or not the editor tries to detect the character encoding you saved your file with when you open it again, or asks you specifically.
By default, when you save a new .cs file, VS uses the current ANSI code page to encode the characters. (You can switch this to use UTF-8 by default with the appropriate options.) However, you can instead choose "Save with Encoding...", which will prompt you for the specific character encoding you want to save it with.
Internally, your code is being handled as UTF-16, since that's what Windows uses as its native string format. On disk, however, UTF-16 would most likely blow up your source files to double their size, since most of the C# code you write probably fits into a single byte per character. So, when writing to disk, VS writes out your data in a particular code page that defines how to convert the UTF-16 characters into some other, possibly 8-bit, character set.
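The size argument is easy to check with a quick Python sketch. The one-line C# snippet below is just a made-up sample of ASCII-only source text.

```python
# Hypothetical ASCII-only source line used as sample data.
source = 'class Program { static void Main() { System.Console.WriteLine("hi"); } }'

# One byte per character for ASCII-range text in UTF-8 (and in ANSI code pages).
utf8_size = len(source.encode("utf-8"))

# Two bytes per character in UTF-16 (BOM not counted here).
utf16_size = len(source.encode("utf-16-le"))

print(utf8_size, utf16_size)
```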
When you reload a file in VS, it attempts to figure out what encoding that file was in, and if it can't, it will fall back on the current ANSI code page. (You can force it to fall back to UTF-8 via some options, but it won't ever fall back to a different encoding.)
When you reload a file "With Encoding", you get the same prompt as when you saved the file, asking you which encoding was used. This way, if Studio gets it wrong, you can fix it.
Unless you do a lot of internationalized programming, where you have foreign-language strings embedded in your .cs file from a language other than the default, you probably don't need to use the explicit "with encoding" save or load. But they are there if you need them.
If you open with encoding you can save with whatever character encoding is appropriate for your culture or region.
I get a text string from a service that contains Unicode control characters
(e.g. \u202B or \u202A and others for Arabic language support).
But while debugging I can't see them in the default text visualizer. So I need to enable display of such characters to determine which of them my text contains. There is a checkbox in the text visualizer, "show all characters", but it doesn't work as I expect.
Any suggestions?
Thanks in advance
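Those are the explicit directional embedding codes: U+202B is RLE (right-to-left embedding) and U+202A is LRE (left-to-right embedding). They are used, for example, when something inside right-to-left text should be displayed in left-to-right order.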
http://unicode.org/reports/tr9/#Directional_Formatting_Codes
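When a visualizer will not show such characters, one workaround is to dump the string with the invisible code points spelled out. The following Python sketch shows the idea; the helper names describe_format_chars and make_visible are hypothetical, and the approach relies on the Unicode general category "Cf" (format), which covers the directional codes mentioned above.

```python
import unicodedata

def describe_format_chars(text):
    """List (index, code point, Unicode name) for each format character."""
    found = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "<unnamed>")
            found.append((index, f"U+{ord(ch):04X}", name))
    return found

def make_visible(text):
    """Replace each format character with a readable <U+XXXX> marker."""
    return "".join(
        f"<U+{ord(ch):04X}>" if unicodedata.category(ch) == "Cf" else ch
        for ch in text
    )

# Made-up sample: RLE, three letters, then POP DIRECTIONAL FORMATTING.
sample = "\u202Babc\u202C"
print(describe_format_chars(sample))
print(make_visible(sample))
```

The same check could be written in any language that exposes Unicode character categories.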