Windows Raster Fonts Encoding Error - windows

I'm writing an interpreter and I have come across a peculiar problem involving character sets (I think).
When I create a file on my Mac called hello.rd and run the command:
file -I hello.rd
I get this output:
hello.rd: text/plain; charset=utf-8
That shows me the file is UTF-8, which it should be. The source file looks like this:
print "Hello World á"
And the output in the terminal is:
Hello World á
This is exactly what I want and expect. The problem arises when I execute the code on Windows. When I execute the same code on Windows I get this output:
As you can see, the á isn't output correctly. I changed the code page to 65001 and it made no difference, but when I used the Lucida Console font, the characters displayed correctly. What I can't understand is why I can type the letter á in the terminal using my keyboard and it displays, but it won't display from my files.
So what I did next was I created a file on my Windows PC called test123.rd and saved this text in it:
print "Hello World á ã ß"
When I execute that on my Mac I get the incorrect output this time, I get:
Hello World ? ? ?
And on my PC I still get the incorrect output, I get this:
I used the file -I command on my Mac on the file test123.rd and I got this output:
test123.rd: text/plain; charset=iso-8859-1
I assume test123.rd displays incorrectly on OS X because its character set isn't UTF-8, but I don't understand why it displays incorrectly on Windows as well.
Does anyone have any idea how to solve the problem, without changing the font of the Windows CMD?

Type cmd /? to see how to switch Unicode on, then choose a Unicode font. Also see chcp /?.
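The symptoms in the question look like an encoding mismatch and not only a font issue. As a rough sketch in Python (not the asker's interpreter; the variable names are purely illustrative, and code page 437 is only an assumed console code page), the same character produces different bytes depending on how the file was saved, and decoding those bytes with the wrong encoding gives the kind of garbage described:

```python
# Illustrative sketch: how "á" (U+00E1) behaves under the encodings mentioned above.
text = "\u00e1"  # á

utf8_bytes = text.encode("utf-8")         # how hello.rd stores it (two bytes)
latin1_bytes = text.encode("iso-8859-1")  # how test123.rd stores it (one byte)

# Assumption: the Windows console uses OEM code page 437.
# Each UTF-8 byte is then shown as its own, unrelated glyph.
shown_in_cp437 = utf8_bytes.decode("cp437")

# The single ISO-8859-1 byte is not valid UTF-8, so a UTF-8 terminal
# substitutes a replacement character for it.
shown_in_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# ascii() keeps the output printable on any console.
print(utf8_bytes, latin1_bytes)
print(ascii(shown_in_cp437), ascii(shown_in_utf8))
```

If this is what is happening, either the file has to be decoded with the encoding it was actually saved in, or the console has to be switched to the encoding the program writes.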

Related

Can I redirect macOS lpr output to a PDF instead of to a printer?

I am looking for a method of sending the output of macOS lpr to a PDF file instead of to a printer.
When I use the lpr command in macOS to print a text file to a printer, it uses the Menlo font for the text, so, clearly, it is not simply sending raw text but applying formatting. I am trying to figure out how to redirect that formatted output to a PDF file instead of to a printer.
I've tried piping the output to pstopdf, but even if that's a possible solution, I can't make it work.
I need to make this a portable solution, so that I can distribute it as part of an app, and therefore I can't require someone else to install brew or any other software on their system. This means I can't use something like enscript which isn't native to macOS to convert a text file to PostScript, then convert that to PDF and print it. I've tried using nenscript but it does strange things to formatting and doesn't produce the correct output the way the lpr command does.
Can anyone suggest a way to get the lpr output into a file?
The answer is:
cupsfilter -i text/plain inputfile > outputfile.pdf

What is the Windows command line parameter encoding?

What encoding does Windows use for command line parameters passed to programs started in a cmd.exe window?
The encoding of command line parameters doesn't seem to be affected by the console code page set using chcp (I set it to UTF-8, code page 65001 and use the Lucida Console font.)
If I paste an EN DASH, encoded as hex E28093, from a UTF-8 file into a command line, it is displayed correctly in the cmd.exe window. However, it seems to be translated to a hex 96 (an ANSI representation) when it is passed to the program. If I paste Cyrillic characters into a command line, they are also displayed correctly, but appear in the program as question marks (hex 3F.)
If I copy a command line and paste it into a text file, the resulting file is UTF-8; it contains the same encoding of the EN DASH and Cyrillic characters as the source file.
It appears the characters pasted into the cmd.exe window are captured and displayed using the code page selected with chcp, but some ANSI code page is used to translate the characters into a different encoding before passing them as parameters to a program. Characters that cannot be converted apparently are silently converted to question marks.
So, if I want to correctly handle command line parameters in a program, I need to know exactly what the encoding of the parameters is. For example, if I wish to compare command line parameters with known UTF-8 data read from a file, I need to convert the parameters from the correct encoding to UTF-8. Thanks.
If your goal is to compare Unicode characters then you should call GetCommandLineW in your program (or use wmain so that argv uses wchar_t) and then convert this UTF-16LE command line string to UTF-8 or vice versa.
GetCommandLineA probably converts the Unicode source string with CP_ACP.
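The bytes observed in the question are consistent with a conversion to an ANSI code page. A small Python sketch of that effect, assuming the ANSI code page is Windows-1252 (the real conversion is done by Windows and may differ in detail, for example through best-fit mappings):

```python
# Sketch of an "ANSI" conversion, assuming the ANSI code page is Windows-1252.
en_dash = "\u2013"                              # EN DASH
cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442"  # six Cyrillic letters

# EN DASH is E2 80 93 in UTF-8 ...
utf8_dash = en_dash.encode("utf-8")
# ... but the single byte 96 in Windows-1252.
ansi_dash = en_dash.encode("cp1252")

# Cyrillic letters have no Windows-1252 mapping; with replacement enabled
# each one becomes a question mark (hex 3F).
ansi_cyrillic = cyrillic.encode("cp1252", errors="replace")

print(utf8_dash.hex(), ansi_dash.hex(), ansi_cyrillic.hex())
```

Because the replacement is lossy, the original characters cannot be recovered from the narrow string, which is why reading the UTF-16 command line is the safer route.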

Perl on Windows: Problems with Encoding

I have a problem with my Perl scripts. In UNIX-like systems it prints out all Unicode characters like ä properly to the console. In the Windows commandline, the characters are broken to senseless glyphs. Is there a simple way to avoid this? I'm using use utf8;.
Thanks in advance.
use utf8; simply tells Perl your source is encoded using UTF-8.
It's not working on Unix either. There are some strings that won't print properly (print chr(0xE9);), and most that do will print a "Wide character" warning (print chr(0x2660);). You need to decode your inputs and encode your outputs.
On Unix systems, that's usually
use open ':std', ':encoding(UTF-8)';
On Windows systems, you'll need to use chcp to find the console's code page. (437 for me.)
use open ':std', ':encoding(cp437)'; # Encoding used by console
use open IO => ':encoding(cp1252)'; # Encoding used by files
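The "decode inputs, encode outputs" rule is not specific to Perl. A minimal sketch of the same idea in Python, assuming a file saved as cp1252 and a console using cp437 (both taken from the example above; the sample bytes are made up for illustration):

```python
# Sketch: decode bytes on input, encode text on output, each with its own encoding.
file_bytes = b"caf\xe9"               # "café" as stored in a cp1252 file
text = file_bytes.decode("cp1252")    # decode input: bytes -> text

console_bytes = text.encode("cp437")  # encode output for a cp437 console
unix_bytes = text.encode("utf-8")     # encode output for a UTF-8 terminal

# The same character é ends up as three different byte sequences.
print(file_bytes, console_bytes, unix_bytes)
```

Writing the cp1252 bytes straight to a cp437 console, without the decode and encode steps, is what produces the wrong glyphs.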

Ruby and Accented Characters

Summary of the wall of text below: How can I display accented characters (so they work via puts, etc) in Ruby?
Hello! I am writing a program for my class which will display some sentences in Spanish. When I try to use accented characters in Ruby, they do not display correctly (in the NetBeans output window (which displays accented characters in Java fine) or in the Command Prompt).
At first, some of my code didn't even run because the accented characters in my arrays were throwing off the Ruby interpreter (I guess?). I got errors saying that Ruby was expecting a closing bracket.
But I did some research, and found a solution, to add the following line of code to the beginning of my Ruby file:
# coding: utf-8
In NetBeans, my program ran regardless of this line. But I needed to add this line to get my program to run successfully in Command Prompt. (I don't know why.)
I'm still, however, having a problem actually displaying the characters to the screen. A word such as "será" will display in the NetBeans output window as "seré". And in the command prompt it draws little pipe characters (that I don't know how to type).
Doing some more research, I heard about:
$KCODE = 'UTF-8'
but I'm not having any luck with this.
I'm using Ruby 1.8 and 1.9 (I go back and forth between different machines).
Thanks,
Derek
The command prompt in Windows 7 uses raster fonts by default, and those don't support Unicode. First, change the cmd font to Lucida Console or Consolas. Then change the command prompt's code page with chcp 65001. You can do it manually or add this line to your Ruby program:
# encoding: utf-8
`chcp 65001` #change cmd encoding to unicode
puts 'será test '

Pasting functions from system clipboard to gVIM

The following is the contents of the Windows System Clipboard
:function CurrentLineLength
: len = strlen(getline("."))
: return len
:endfunction
I hit the colon and then control r
I then hit shift 8 to paste the contents of the system clipboard.
I hit return and vim comes back with
E488: Trailing Characters
I see some ^M characters in there, and removing them does not help. I do know that I can paste the functions into a .vim file and read them that way, so it's not crippling, but as I work through some examples of Vim script this would be nice to have.
Is there something special about how functions are entered in or is it possible to paste them from the system clipboard?
Thanks!
I'm not sure about pasting multiple lines to command mode, but you can achieve the same thing by simply putting the function in a register and executing the register (same as a macro).
Also, Vim doesn't seem to like that function as you've pasted it, so I've made a couple of changes below: the function name needs parentheses, and the assignment needs let. If you copy the below to the system clipboard and then press @* from normal mode, it works.
:function CurrentLineLength()
: let len = strlen(getline("."))
: return len
:endfunction
Vim should not have any problems with carriage returns in command mode (that's what the ^M characters are). I would guess that there are some other characters in the code you're pasting - this is quite possibly the problem if you're pasting from a web page. Try putting the contents of your clipboard into a file and see if it's really what you expect it to be (including all whitespace characters).
