Most of the monospace fonts normally used for programming on Windows (all I have found so far) don't display 'funny quotes' (0xE2) properly. For example, an error message from the gcc compiler like
warning: conflicting types for built-in function âprintfâ
which is really
{funny quote}fname{funny quote}
displays in Lucida Console, DejaVu, etc. as
{circumflex a}fname{circumflex a}
(It may be doing so for you right now.) Is there a Helvetica-like monospace font that respects that particular 8-bit code page?
Windows has a split personality. Most of it is based on Unicode, while some parts still rely on code page character translations.
The character you're getting for 0xE2 is the proper Unicode translation, and is the same in the very common code page 1252 and most of the rest as well. The only code page that has a quote for that value is code page 10000, Mac OS Roman.
The chcp command is used to change the code page of the command window, but I can't get it to work for your specific case.
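For reference, the usual incantation is to switch the console to the UTF-8 code page, which only takes effect if a TrueType console font (e.g. Lucida Console or Consolas) is selected:

chcp 65001

but, as noted above, it may still not solve this particular case.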
Not sure if SO is the best place for this question, but I don't know where else to ask.
Is there any way to transform an SVG like this one, for example (https://svgsilh.com/image/1775543.html), into something that I can use inside an editor with copy/paste, like this one? 🦄
No, because the unicorn emoticon is one example of a character. And just as with letters, digits, and punctuation, the appearance of emoticons and other plain-text symbols is decided by fonts.
LSerni wrote the following:
The reason you can "copy and paste" that icon is that the icon already has a UTF-8 code and your editor is UTF-8 aware. And this is why the same emoticon is slightly different between Apple, Android and so on: it's because it's always code XYZ, but code XYZ is rendered with different icons on different platforms.
But that's not entirely correct. The difference in rendering lies more in the font than in the operating system that displays the emoticon. Unless the font in use supplies its own version of a symbol, that symbol will usually come from a fallback font chosen by the operating system, and different operating systems ship different symbol fonts.
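For instance, the unicorn is simply the code point U+1F984; a minimal JavaScript sketch (any Unicode-aware editor or string API handles it the same way):

// Build the character from its code point instead of pasting it.
const unicorn = String.fromCodePoint(0x1F984); // U+1F984 UNICORN FACE
console.log(unicorn);                          // 🦄 (which glyph you see depends on the font/platform)
console.log(unicorn.codePointAt(0).toString(16)); // "1f984"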
I would like to create a PostScript or PDF figure with enhanced notation, italic or bold Latin characters, and sometimes (regular) Greek characters. How can I do that in general?
Let's say I downloaded CMU Sans Serif, a font that has glyphs for all the strange characters I ever want to use. I converted them to pfa with an online tool and copied the files to the working directory.
Expectations
Let's say I'd like to produce the following notation somewhere.
What I tried: original
I created a gnuplot script, encoded as a UTF-8 file (without BOM), with the content
set term postscript eps enhanced "CMUSansSerif" 15 fontfile add 'CMUSansSerif.pfa' fontfile add 'CMUSansSerif-Oblique.pfa' fontfile add 'CMUSansSerif-Bold.pfa'
set encoding utf8
set o "print.eps"
p x t "Label: {/CMUSansSerif-Bold important }{/CMUSansSerif-Oblique note}: ∫⟨α₂ + β²⟩ = äßű"
set o
and executed it with the newest gnuplot, version 5.2.6.
What I got
I used a vector graphics editor to open the eps file, and the relevant part looks like this:
What I also tried
Following Ethan's answer, I added adobeglyphnames to the terminal options. It made at least the letters available, but other Unicode symbols are still unavailable. The result is:
Question
What went wrong? How could I produce the desired output?
There are so many possibilities for where things can go wrong: Is the font not suitable for this task? Did I download a wrong version of it? Did the pfa converter do a bad job? Did I include the font files incorrectly? Was there something wrong with the set encoding? Do I use a bad vector graphics editor? Do I have the wrong fonts installed, which the vector graphics editor then tries to use?
I am afraid that the answer is that in general PostScript is the wrong tool for this. If it is at all possible for you to work with PDF output instead, I suggest you do that. It is even possible that the resulting PDF file can be translated to a PostScript file by standard tools (e.g. pdf2ps). That is likely to work if the non-ASCII characters are limited to Greek and other relatively common symbols, but I don't know how much of the full Unicode tables is covered by those standard tools.
If you really need to produce PostScript with additional unicode characters directly from gnuplot, you can find full instructions and sample character encoding tables in the gnuplot distribution files:
.../term/PostScript/unicode_maps.README
.../term/PostScript/unicode_big.map
.../term/PostScript/unicode_small.map
I am not familiar with the online font conversion tool you used, but it probably failed because it did not have, or at any rate did not use, suitable character encoding tables for the desired conversion.
===
One other thought. There are two ways that a *.pfa font can encode unicode characters that are common enough to have a name assigned by Adobe for use in PostScript. (1) It may use generic names like uni0439 for Unicode code points. (2) It may use Adobe-specific names from the list here:
agl-aglfn glyph list
When selecting PostScript output from gnuplot you can tell it which of these two conventions is used by the font you provide. The default is "noadobeglyphnames".
set term postscript {no}adobeglyphnames
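As a sketch only (exact option placement may differ between gnuplot versions), the terminal line from the question would then look something like:

set term postscript eps enhanced "CMUSansSerif" 15 adobeglyphnames \
    fontfile add 'CMUSansSerif.pfa' fontfile add 'CMUSansSerif-Oblique.pfa' fontfile add 'CMUSansSerif-Bold.pfa'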
==
(recipe for using "set term pdfcairo")
Font handling is unfortunately system-specific, so I cannot tell you how to install or configure fonts on all your target machines. I will show you a procedure that works on a linux desktop that uses the fontconfig utilities for system font handling.
Create directory /home/share/fonts/CMUSans
Add this directory to the search list in file /etc/fonts/local.conf
Copy the *.ttf files into this directory from the CMU Sans Serif zip archive you link to in your original query. The fontconfig system tools should now be able to find these fonts (a sketch of the local.conf entry and a quick check appear after this recipe). By inspection they self-report as "CMU Sans Serif".
in gnuplot (tested with version 5.2.6)
set term pdfcairo font "CMU Sans Serif,15"
set output 'enhanced_utf8.pdf'
load 'enhanced_utf8.dem'
convert the output PDF file to PostScript with the following command
pdf2ps enhanced_utf8.pdf enhanced_utf8.ps
Screenshot of the result is shown below
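In case it helps, here is a sketch of the /etc/fonts/local.conf entry mentioned above, plus a quick check that fontconfig can see the new family (the directory path is just the one used in this recipe):

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- tell fontconfig to also scan the directory created above -->
  <dir>/home/share/fonts/CMUSans</dir>
</fontconfig>

fc-cache -fv
fc-list | grep "CMU Sans Serif"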
It seems that CMU Sans Serif doesn't contain the Unicode characters you are asking for. Check the font with a font editor like Birdfont. Although the web page shows the symbols you want to use, the font itself does not contain them. Your browser may still show the symbols, but they are just fallback representations from other fonts.
I have found this letter/character on Facebook, but how can it have a color? It's just insane to me. Look at this: ✔️
Added image (from Firefox on Windows)
It's not an ASCII character, it's likely an emoji. Emoji are part of Unicode and the actual glyph displayed to the user is open to interpretation by the platform displaying it. The spec suggests a name/description, but the implementation varies.
So while you may see a colored check mark, I see black & white. Other times, a single glyph will have multiple styles made available on a particular platform; for example, I can select multiple "skin" tones when I use a smiley face on my iPhone, but your Android device may only show a generic one.
Edit: The image edited into the original post is a perfect example. Using Chrome on Windows, I see a black check mark. The screenshot from Firefox shows green.
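As a side note, you can inspect what the pasted symbol actually contains; a minimal JavaScript sketch (if it shows in color, it is often U+2714 followed by U+FE0F, the variation selector that requests emoji presentation):

const symbol = "✔️"; // paste the character you found here
// Spread the string to iterate by code point, then print each one in hex.
console.log([...symbol].map(c => "U+" + c.codePointAt(0).toString(16).toUpperCase()));
// e.g. ["U+2714", "U+FE0F"]: HEAVY CHECK MARK plus VARIATION SELECTOR-16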
The symbols used here aren't ASCII-encoded. They come from the much larger range of Unicode; extended ASCII is restricted to a 256-symbol set.
How a Unicode symbol/glyph (a small pictorial representation; these ticks aren't ordinary text characters) is drawn can vary between platforms, because Unicode defines the code point but not a single global appearance for it.
That is why, even though the encoding is the same on every device, the rendering is interpreted differently by different devices and online platforms, so one user may see a coloured symbol and another a plain black one.
Take the character "ب". It shows in Stack Overflow. I can see it in a cshtml file and in a js file.
The character "ُ", on the other hand, shows here correctly. However, it shows as a question mark in the cshtml file and the js file. If I copy it to Notepad it shows as a Ḍammah (a loop normally placed above a letter which indicates a 'u' sound).
Why is it a question mark in the cshtml file if Notepad understands it? Also, Visual Studio understands other Arabic characters, so why not this one?
All I can think of is that a Dammah (as far as I know) always sits above another letter so can't be used in isolation?
What I'm trying to do is detect words that have a Dammah in them via Javascript
I'm completely new to Unicode and non-ASCII characters, so this may be a stupid question; apologies if so.
It often happens that an application uses a default font that does not support all the Unicode characters a use case requires. In that case, try changing the font to a more compatible one. "Courier New" works mostly well, and "Arial Unicode MS" also does a good job. But no font covers absolutely everything, so you may need to switch between two or three fonts to cover all required characters. For Arabic, "Arial" is a good choice, but there are many interesting alternatives.
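As for the detection part of the question, a minimal JavaScript sketch, assuming "has a Dammah" simply means the word contains the combining mark U+064F:

const DAMMA = "\u064F"; // ARABIC DAMMA, a combining mark that sits on the preceding letter

function wordsWithDamma(text) {
    // Split on whitespace and keep only the words containing the mark.
    return text.split(/\s+/).filter(word => word.includes(DAMMA));
}

console.log(wordsWithDamma("كُتب كتب")); // only the first word carries a Dammah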
I'm trying to find a way to make dingbats appear exactly the same in Firefox, Chrome, Safari and IE.
I noticed that the Dingbats appear the same in IE/Chrome/Safari, HOWEVER - in Firefox - they look "thinner".
For example - try to visit the following page:
http://en.wikipedia.org/wiki/Dingbat
You'll notice that when viewing that page in Firefox - the characters look different in comparison to Chrome/IE.
Does anybody know why this happens, and how I can make Firefox display the characters EXACTLY as they appear in Chrome/IE?
I'm trying to find a way to make dingbats appear exactly the same
You will never make fonts look exactly the same in all browsers, whether the characters in question are Dingbats or not.
For me, most of the characters on that page don't render in IE or WebKit. IE traditionally has poorer-than-average font fallback, and Firefox typically better than average. The font that Firefox and Opera manage to choose to render the symbols for me is Meiryo (a Japanese font installed with Windows Vista and later). On IE and WebKit it falls back to the much more limited selection of symbols available in Arial, leaving most of the characters missing.
So for the best chance of rendering symbol characters how you want, do as you do for any other characters, and specify the font you want, eg. CSS font-family: Meiryo. But of course anyone who doesn't have that font installed will get something different, and browser/OS settings may change how fonts are rendered in general.
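For example, a minimal CSS sketch (Meiryo is just the font that happened to work here; "Segoe UI Symbol" is a common Windows fallback for symbol characters, and neither is guaranteed to be installed):

.dingbats {
    font-family: Meiryo, "Segoe UI Symbol", sans-serif;
}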
The symbol characters from the Zapf Dingbats set are not safe to use on the web, as the basic default sets of fonts installed by operating systems do not typically include glyphs for most of them. (‘Wingdings’ on Windows does, but it's a legacy font with a custom mapping that puts the symbols on ASCII characters instead of where they should be in Unicode, so again it's not safe to use on the web.)
There are a few symbol characters that you can typically get away with using for commonly-available font sets, eg:
● ■ ☺ ☻ ♥ ♦ ♣ ♠ • ▲ ▼
Others, I'd try to avoid.
Interesting... On a Mac OS X 10.6.6 machine with Firefox 3.6.13 and Chrome 8.0.552.231, the Wikipedia pages do render the first table, the ITC Zapf Dingbats, slightly differently. The effect is most noticeable on the solid half-circle at the bottom left corner of the set of characters.
The main Unicode Dingbats table renders almost the same; Firefox generates boxes containing the 4 hex digits of the missing character for the missing symbols, but Chrome just generates empty boxes - I prefer Firefox's technique.
The browsers must either be using slightly different fonts or slightly different font sizes (though I can't detect a size difference by eye). I've not looked at the HTML that is being rendered.
On the whole, I think that this is within the realm of 'allowable variation' - but I'm not an expert. I suspect you have a world of worry ahead of you if you demand pixel-for-pixel similarity between browsers. The concern should be to get the message across clearly.