I have a solution in Visual Studio, and my program's language is Brazilian Portuguese.
Every time I compile and run it, it simply doesn't show the characters I wrote.
Example:
#include <stdio.h>

int main(void) {
    printf("áéíóúàèìòù");
    return 0;
}
It simply shows something really strange.
However, when I redirected the output to a file, it showed the right output, so I think the problem might be in cmd.
Then I searched for what might be causing the problem, and the results basically all pointed to the code page that cmd uses.
Finally I tried chcp 1252, but it doesn't seem to work for me, so here I am. Does anyone know which code page I should use, or what I can do to the source file so that it shows the right output? Thanks in advance.
I'm assuming C++.
The reason is that the file is saved with UTF-8 encoding, and the string literals are treated as a sequence of bytes.
So if you have "é" in your source code, it's treated as the two bytes "\xC3\xA9", and it gets displayed in CP-437 (the default code page for the Windows Command Prompt on US English systems) as ├⌐
Solution: either:
save your source files in some 8-bit encoding (for example CP-1252), change the default encoding in VS, and set the terminal to use the same encoding,
or change your terminal to something that supports UTF-8, like Cygwin.
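The mismatch described in this answer can be reproduced without a console. Below is a minimal sketch in Python (used here only for illustration; the question itself is about C/C++). It assumes the source file is UTF-8 and the console uses CP-437; the helper name as_seen_by_console is hypothetical.

```python
def as_seen_by_console(text: str, source_encoding: str, console_encoding: str) -> str:
    """Encode text the way it is stored in the compiled program,
    then decode those bytes the way the console would draw them."""
    raw = text.encode(source_encoding)
    return raw.decode(console_encoding, errors="replace")


# Bytes stored for the literal "é" when the source file is UTF-8.
utf8_bytes = "é".encode("utf-8")                         # b'\xc3\xa9'

# Source saved as UTF-8, console set to CP-437: two box-drawing characters.
mismatch = as_seen_by_console("é", "utf-8", "cp437")     # '├⌐'

# Source and console both set to CP-1252: the character survives.
matched = as_seen_by_console("é", "cp1252", "cp1252")    # 'é'
```

Both fixes listed above amount to making the two encoding arguments equal.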
I wrote a script with German special characters e.g. ü.
However, whenever I close R and reopen the script the characters are substituted:
Before "für"; "hinzufügen"; "Ø" - After "fÃ¼r"; "hinzufÃ¼gen"; "Ã˜".
I tried to remedy it using save with encoding and choosing UTF-8 as it is stated here but it did not work.
What am I missing?
You don't say what OS you're using, but this kind of thing really only happens on Windows nowadays, so I'll assume that.
The problem is that Windows has a local encoding that is not UTF-8. It is commonly something like Latin1 in English-speaking countries. I'm not sure what encoding people use in German-speaking countries, if that's where you are. From the junk you saw, it looks as though you saved the file in UTF-8, then read it using your local encoding. The encodings for writing and reading have to match if you want things to work.
In RStudio you can try "Reopen with encoding..." and specify UTF-8, and you'll probably get your original back, as long as you haven't saved it after the bad read. If you did that, you've got a much harder cleanup to do.
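As a rough illustration of the mismatch and of the "harder cleanup", here is a minimal Python sketch (Python rather than R, purely for illustration). It assumes the local encoding is Windows-1252, which is only a guess based on the junk shown in the question; the function names misread and repair are hypothetical.

```python
def misread(text: str, saved_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Simulate saving text in one encoding and reopening it in another."""
    return text.encode(saved_as).decode(read_as)


def repair(garbled: str, saved_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Undo a single bad read by reversing the two steps.
    Raises UnicodeError if the text was not garbled in exactly this way."""
    return garbled.encode(read_as).decode(saved_as)


garbled = misread("für")       # 'fÃ¼r'
restored = repair(garbled)     # 'für'
```

This only works when every garbled byte is defined in the local encoding and the file went through exactly one bad read and save; characters whose UTF-8 bytes have no Windows-1252 mapping cannot be recovered this way.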
So... I was testing jGrasp, and when I opened my test file I saw something like this:
¿Khà ?
instead of this:
¿Khà?
But when I compiled it, I first got the weird characters (the encoding was wrong). So I changed the encoding under WorkSpace>Charset (the default, I/O and cygwin) to UTF-8 and got the correct output (like in the second image)... but it still looks the same in jGrasp.
If I change it in jGrasp so that it looks "good", it will look different in other text editors (and also in the compiler).
EDIT
I have found a few other encodings that work, but they aren't UTF-8, and I also don't want to be changing the encoding all the time.
I'm not clear on exactly what the problem is, but if you need to open and/or edit a single file with a specific encoding different from the default, use "File" > "Open" and specify the charset on the dialog. The charset choice will be remembered.
I created an ordinary text file on Windows 7 64-bit using GNU Emacs 23.3.1. I can edit the file with other programs such as LinqPad (the file happens to be a LinqPad script, extension .linq). Everything is fine until I put a Unicode character in the file, such as the Greek letter λ (lambda). I can input the letter in Emacs and it displays correctly. However, Emacs refuses to save the file, reporting the following error:
Failure in loading charset map: 8859-7
If I input the λ in LinqPad, emacs will read and display them, but will not save the file.
I just noticed that Notepad++ has other unexpected behavior with this file: it does not display the λ's, but instead pairs of odd characters such as Î». That is fitting to an untuition (pun intended) that the Unicode chars are being stored as pairs. So it looks like this is a kind of ambiguous situation (storing Unicode in text files), but it also looks like LinqPad and Visual Studio "do the obvious thing."
I want to use Emacs because it's the only program I have that reflows sequences of commented lines (lines after //, reflowed with Alt-Q), and I want to use Greek characters in my comments because I'm describing a mathematical program.
I'll be grateful for advice and answers.
UPDATE: some advice in other questions said to try M-x describe-char, also bound to C-x = ; both of those give me the same failure message as above, so they're on the right track, just not answers.
This once happened to me when I had upgraded all packages (including Emacs) without realising I still had an Emacs session open during the upgrade. Next time I asked it to save some Unicode, it tried to load 8859-7 and failed because the path was different in the upgraded version. I had to redo the edit after restarting Emacs.
I just noticed that Notepad++ has other unexpected behavior with this file: it does not display the λs, but instead pairs of odd characters such as Î».
Î» is what you get when you interpret the byte sequence 0xCE, 0xBB using the encoding ISO-8859-1, or Windows code page 1252 (Western European). Code page 1252 is probably the default (‘ANSI’) code page on your machine.
0xCE, 0xBB is the UTF-8 encoding of the character λ (U+03BB Greek small letter lambda). So to display it correctly you need to tell your text editor that the file is saved in UTF-8 and not ANSI.
In Notepad++, choose UTF-8 from the menu bar ‘Encoding’ entry.
In Emacs, C-x C-m c utf-8-dos (or unix or whatever) as a prefix to opening or saving the file. Hopefully by saving in UTF-8 you'll avoid whatever the problem is with the ISO 8859-7 (Greek) map; you certainly don't want to be saving any files in 8859-7, or indeed anything but UTF-8, if you can help it.
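The byte-level picture can be checked with a short Python sketch (Python is used only for illustration). The helper looks_like_utf8 is a hypothetical name, and a strict decode is only a heuristic for guessing how a file was saved, not a definitive detector.

```python
lam = "\u03bb"  # Greek small letter lambda

utf8 = lam.encode("utf-8")               # b'\xce\xbb': two bytes
greek_8bit = lam.encode("iso8859_7")     # b'\xeb': one byte in ISO 8859-7
shown_as_ansi = utf8.decode("cp1252")    # 'Î»': the pair Notepad++ displayed


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Here looks_like_utf8(utf8) is True, while the single ISO 8859-7 byte is rejected because 0xEB on its own is an incomplete UTF-8 sequence.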
Take this text for example:
the three umlauts are ä, ö, and ü.
Let's assume they are in a text file, which I'm reading like this:
data = File.read("umlauts.txt")
Now, if I try to output them, I get this:
the three umlauts are Σ, ÷, and ⁿ.
If I write it to a file, they are written correctly. How can I make them show up properly in a Windows command prompt? I'm using Ruby 1.8.6. I want to be able to do quick debugging from the command prompt.
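What encoding is the file? Judging by the output, it is probably a single-byte encoding such as Windows-1252 or Latin-1 rather than UTF-8: there, ä is the byte 0xE4, which CP437 (a common cmd code page) displays as Σ. Either way, the Windows cmd prompt does not use the file's encoding by default.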
Here's a good article that covers this: http://illegalargumentexception.blogspot.com/2009/04/i18n-unicode-at-windows-command-prompt.html
Maybe set a different code page for cmd?
For explanations on encodings, read this.
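One way to narrow down which encodings are involved is to try a few candidates and see which one reproduces the garbled output. The following is a minimal Python sketch (Python rather than Ruby, purely for illustration). It assumes the console code page is CP437, which is inferred from the characters in the question, and matching_encodings is a hypothetical helper name.

```python
def matching_encodings(text, observed, console="cp437",
                       candidates=("utf-8", "cp1252", "latin-1")):
    """Return the candidate file encodings whose bytes, when drawn by the
    console code page, reproduce the observed output."""
    matches = []
    for encoding in candidates:
        try:
            shown = text.encode(encoding).decode(console)
        except UnicodeError:
            continue  # this candidate cannot represent the text at all
        if shown == observed:
            matches.append(encoding)
    return matches


original = "the three umlauts are ä, ö, and ü."
observed = "the three umlauts are Σ, ÷, and ⁿ."

file_encodings = matching_encodings(original, observed)

# Bytes the console would display correctly under the assumed code page.
bytes_for_console = original.encode("cp437")
```

Under these assumptions only the single-byte candidates match, so the remaining choice is to either switch the console code page to the file's encoding or re-encode the text for the console before printing.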
I've sort of got Fsi.exe working as expected on Mac OS X (Snow Leopard) with Mono. I just noticed a bit of odd behavior with cut and paste, and I was wondering if anyone else had seen this.
I've defined the following alias for fsi:
alias fsi='ledit mono ~/FSharp-1.9.7.8_2/bin/fsi.exe --gui-'
ledit is an OCaml utility that seems to make the keyboard input work correctly; without it, fsi just never seems to read the input. To see what I mean, try Fsi.exe without ledit and enter
let square x = x * x;;
Without ledit, it just never seems to parse the input: it never comes back to the ">" prompt after you enter the string. With ledit, the ">" prompt comes back immediately.
Of course the --gui- keeps fsi from displaying all the messages about the lack of System.Drawing etc.
So this all seems to work. The oddity is when I copy and paste code into the FSI, certain characters seem to repeat over and over again. It seems to be conditioned by the size of the buffer I'm pasting in. When I paste small snippets there seems to be no problem. But if I paste in larger chunks, there's this oddity.
If I do the following:
open System.IO;;
then paste this code snippet in FSI:
let buildFileList basepath filespec =
    seq {
        yield! Directory.GetFiles(basepath, filespec, System.IO.SearchOption.AllDirectories)
    }
That works fine. But if I copy and paste in a bigger chunk of code ending with that, it repeats the portion up to the yield! over and over again. It seems to be somehow related to fsi attempting to parse the code as it's being pasted in because the same code pasted in will cause parsing errors (like FS0010) when it's pasted at the end of a long chunk but won't cause an error when it's isolated.
If I #load the entire file, it parses correctly as well so I think my code is ok.
This oddity in copy/paste seems to happen both with and without ledit on the command line. I don't mind researching this issue myself but I'm kind of stumped about where I should proceed with this. I'm copying from GVim if that makes a difference but anyone have any idea where I might proceed in trying to isolate the cause of this odd behavior? I suppose I could take the extra step of copying into TextEdit first and then trying to copy into fsi but any ideas beyond that?
To bottom line this: has anyone else seen this odd behavior? If not, any suggestions about how I might proceed in trying to isolate the cause of this odd behavior?
When I encountered this behavior on my Mac, I went a different route. Instead of using ledit, I employed fsi's --readline option, seen below (where ${FSHARP} is my install path).
mono ${FSHARP}/fsi.exe --readline+ --gui-
You may also want to check your terminal settings. My terminal (for example) is declared as xterm-color, and I have unchecked "Delete sends Ctrl-H". I think those are the only relevant settings, but don't hold me to it.