Arabic letter noon ghunna incorrectly displayed with a dot - Windows

Background
The Arabic letter noon ghunna (ں) is displayed incorrectly on my Windows 10 PC (in Chrome, Edge, Notepad and Word): the sequence ALEF, NOON GHUNNA, ALEF is rendered with a dot over the noon ghunna.
The same sequence is displayed correctly, without the dot, on my Android phone.
For completeness, the actual Unicode string (for copy/paste purposes) is:
اںا
There has been some controversy regarding this letter (L2-12/381), which has settled by now, as can be seen from the Unicode Standard, which states (since version 7 and up to the current 11):
Rendering systems should display U+06BA as a dual-joining letter, with all four contextual forms shown dotless, regardless of the language of the text.
But the dot appears in word-initial (ںا) and mid-word (اںا) positions. Final (اں) and isolated (ں) forms are fine.
Question
Now my question is, how can this be fixed, other than by waiting for Microsoft to fix it? I want to understand where the problem lies. Is it in the Uniscribe library, or is it down to the font being used? Can it be fixed by using a specifically crafted TrueType/OpenType font?

This turned out to be a font problem. Quite a few fonts on fonts.google.com show this letter correctly:
https://fonts.google.com/?subset=arabic&selection.family=Amiri|Aref+Ruqaa|Cairo|El+Messiri|Harmattan|Lemonada|Mada|Reem+Kufi|Scheherazade
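One quick way to confirm that the problem lies in rendering rather than in the text itself is to dump the code points of the string. A minimal C# sketch (assuming a console app with the string hard-coded) would be:

using System;

class DumpCodePoints
{
    static void Main()
    {
        // The sequence ALEF, NOON GHUNNA, ALEF from the question.
        string s = "\u0627\u06BA\u0627";

        // All three characters are in the Basic Multilingual Plane,
        // so each UTF-16 code unit is a full code point here.
        foreach (char c in s)
            Console.WriteLine("U+{0:X4}", (int)c);

        // Expected output: U+0627, U+06BA, U+0627
    }
}

If the middle character comes back as U+06BA, the dot seen in the initial and medial forms is being added by the font or the shaping engine, not by the data.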

Related

How can a letter / character have color? Like this: ✔️

I found this letter / character on Facebook, but how can it have a color? It just seems insane to me. Look at this: ✔️
Added image (from Firefox on Windows)
It's not an ASCII character; it's likely an emoji. Emoji are part of Unicode, and the actual glyph displayed to the user is open to interpretation by the platform displaying it. The spec suggests a name/description, but the implementation varies.
So while you may see a colored check mark, I see black & white. Other times, a single glyph will have multiple styles made available on a particular platform; for example, I can select multiple "skin" tones when I use a smiley face on my iPhone, but your Android device may only show a generic one.
Edit: The image edited into the original post is a perfect example. Using Chrome on Windows, I see a black check mark. The screenshot from Firefox shows green.
The symbols used here aren't ASCII-encoded. They come from the much larger Unicode range; extended ASCII is restricted to a 256-symbol set.
How such symbols are rendered as glyphs (small pictorial representations; these ticks aren't ordinary text characters) can vary between platforms, because Unicode leaves the visual presentation open rather than fixing it globally.
That is why, while the Unicode encoding remains the same on every device, the rendering is interpreted differently by different devices and online platforms, so you may see either a coloured or a black symbol.

Visual Studio, cshtml file, understanding how Arabic characters are treated

Take the character "ب". It shows on Stack Overflow, and I can see it in a cshtml file and in a js file.
The character "ُ", on the other hand, shows correctly here, but it appears as a question mark in the cshtml file and the js file. If I copy it to Notepad it shows as a Dammah (U+064F, a loop normally placed above a letter which indicates a 'u' sound).
Why is it a question mark in the cshtml file if Notepad understands it? Also, Visual Studio displays other Arabic characters, so why not this one?
All I can think of is that a Dammah (as far as I know) always sits above another letter, so it can't be used in isolation?
What I'm trying to do is detect words that have a Dammah in them via JavaScript (see the sketch after the answer below).
I'm completely new to Unicode and non-ASCII characters, so this may be a stupid question; apologies if so.
It often happens that an application uses a default font that does not support all the Unicode characters a use case requires. In that case, try changing the font to a more complete one. "Courier New" mostly works well, and "Arial Unicode MS" also does a good job. But no font covers absolutely everything, so you may need to switch between two or three fonts to cover all required characters. For Arabic, "Arial" is a good choice, but there are many interesting alternatives.
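On the detection part of the question: the Dammah is the combining mark U+064F, so finding words that contain it is just a matter of looking for that code point. Here is a minimal sketch in C# (the question asks for JavaScript, where the equivalent test would be the regex /\u064F/; the sample string is made up for illustration):

using System;
using System.Linq;

class FindDammah
{
    static void Main()
    {
        // Sample text: the second word carries a Dammah (U+064F).
        string text = "كتب كُتب";

        // Split on whitespace and keep the words that contain U+064F.
        var wordsWithDammah = text
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => w.Contains("\u064F"));

        foreach (var word in wordsWithDammah)
            Console.WriteLine(word);
    }
}

Because the Dammah is a combining mark it always follows the base letter it sits on, so matching the single code point inside each word is enough.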

Which parts should not be included in this ZPL code?

I am developing a Windows Mobile app that needs to print to a Zebra printer. The problem is, I do not have the printer with me here in my country, since the client did not provide one.
My approach was to design a label in ZebraDesigner2 first, then print the label to a text file. Printing to a text file instead of a printer writes out the ZPL code that would produce the label, so I can generate ZPL faster by designing the label visually and then reading off the generated code. It's a bit like having a drag-and-drop GUI with a backing XML.
Say that I have this simple label that contains this text:
Hello World!
If I print this from ZebraDesigner2, it is written to my text file as:
CT~~CD,~CC^~CT~
^XA~TA000~JSN^LT0^MNW^MTT^PON^PMN^LH0,0^JMA^PR5,5~SD15^JUS^LRN^CI0^XZ
^XA
^MMT
^PW609
^LL0406
^LS0
^FT1,29^A0N,28,28^FH\^FDHello World!^FS
^PQ1,0,1,Y^XZ
My main question is: which part do I include in my C# code if I'm going to send this to the printer from my Windows Mobile C# app? Do I include the part from ^XA to ^XZ? I believe that CT~~CD,~CC^~CT~ should not be included in my code, if I'm not mistaken.
Late answer, but since this is getting viewed...
The CT~ line and the first ^XA..^XZ block set up the modes, label length, printable area, etc.
If you remove those, it will take those settings from the label/printer settings, which is usually what you want. The printers can sense the length and width of the label.
Leaving them in can cause big problems: if you define the printable area in your label and the next label type submitted does not, it will use the settings you defined, which can cause blank areas in the label, e.g. cut-off USPS label barcodes printed after your ZebraDesigner custom labels.
I found this out the hard way. Leave those out, and from the remaining ^XA..^XZ block you should also leave out ^MMT, ^PW609, ^LL0406 and ^LS0; your Hello World will not be affected.
If you really do want to limit the area printed to, set up margins inside the printable area, etc., refer to the manual.
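To make that concrete, here is a minimal C# sketch that sends only the trimmed ^XA..^XZ label block to the printer. The host name and the raw TCP connection on port 9100 (the usual raw-printing port on network-attached Zebra printers) are assumptions for illustration; on a Windows Mobile device the transport may instead be Bluetooth, serial, or the Zebra SDK mentioned in the answer below.

using System;
using System.Net.Sockets;
using System.Text;

class SendZpl
{
    static void Main()
    {
        // Only the ^XA..^XZ label-format block; the CT~ line and the first
        // configuration block produced by ZebraDesigner are left out.
        string zpl = "^XA" +
                     "^FT1,29^A0N,28,28^FH\\^FDHello World!^FS" +
                     "^PQ1,0,1,Y^XZ";

        // "printer-host" and port 9100 are placeholders; adjust for your setup.
        using (var client = new TcpClient("printer-host", 9100))
        using (var stream = client.GetStream())
        {
            byte[] data = Encoding.ASCII.GetBytes(zpl);
            stream.Write(data, 0, data.Length);
        }
    }
}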
You have to look at the programmer's guide before you remove any part of the code. The CT~ command, for example, changes the control prefix.
Search the internet or the zebra.com site for the "ZPL Programming Guide".
So, leave the text file as is and include it in your Windows Mobile application.
PS: Zebra offers SDKs for label/receipt printers: http://www.zebra.com/gb/en/products-services/software/adapt-software.html
PS2: Without a test printer you may get poor final results.

Display of Asian characters (with Unicode): Difference in character spacing when presented in a RichEdit control compared with using ExtTextOut

This picture illustrates my predicament:
All of the characters appear to be the same size, but the space between them is different when presented in a RichEdit control compared with when I use ExtTextOut.
I would like to present the characters the same as in the RichEdit control (ideally), in order to preserve wrap positions.
Can anyone tell me:
a) Which is the more correct representation?
b) Why the RichEdit control displays the text with no gaps between the Asian Characters?
c) Is there any way to make ExtTextOut reproduce the behaviour of the RichEdit control when drawing these characters?
d) Would this be any different if I was working on an Asian version of Windows?
Perhaps I'm being optimistic, but if anyone has any hints to offer, I'd be very interested to hear.
In case it helps:
Here's my text:
快的棕色狐狸跳在懶惰狗1 2 3 4 5 6 7 8 9 0
Apologies to Asian readers: this is merely for testing our Unicode implementation, and I don't even know which language the characters are taken from, let alone whether they mean anything.
In order to view the effect by pasting these characters into a RichEdit control (e.g. Wordpad), you may find you have to select them and set the font to 'Arial'.
The rich text that I obtain is:
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Arial;}}{\colortbl ;\red0\green0\blue0;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\fs22\u24555?\u30340?\u26837?\u33394?\u29392?\u29432?\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3 4 5 6 7 8 9 0\par\pard\'a3 $$ \'80\'80\cf1\lang2057\fs16\par}
It doesn't appear to contain a value for character 'pitch', which was my first thought.
I don't know the answer, but there are several things to suspect:
There are several versions of the rich edit control. Perhaps you're using an older one that doesn't have all the latest typographic improvements.
There are many styles and flags that affect the behavior of a rich edit control, so you might want to explore which ones are set and what they do. For example, look at EM_GETEDITSTYLE.
Many Asian fonts come in two versions on Windows: one optimized for horizontal layout and the other for vertical layout. The latter usually has the same name but with @ prepended to it. Perhaps you are using the wrong one in the rich edit control.
UPDATE: By messing around with Wordpad, I was able to reproduce the problem with the crowded text in the rich edit control.
a) Open a new document in Wordpad on Windows 7. Note that the selected font is Calibri.
b) Paste the sample text into the document.
c) The text appears correct, but Wordpad changed the font to SimSun.
d) Select the text and change the font back to Calibri or Arial.
The text will now be overcrowded, very similar to your example. Thus it appears the fundamental problem is with font linking and fallback. ExtTextOut is probably selecting an appropriate font for the script automatically. Your challenge is to figure out how to identify the right font for the script and set that font in the rich edit control.
This will only help with part of your problem, but there is a way to draw text to a DC that will look exactly the same as it does with RichEdit: what's called the windowless RichEdit control. It's not exactly easy to use: I wrote a CodeProject article on it a few years back. I used it to solve the problem of a scrollable display of blocks of text, each of which can be edited by clicking on it: the normal drawing is done with the windowless RichEdit, and the editing by showing a "real" RichEdit control on top of it.
That would at least get you the text looking the same in both cases, though unfortunately both cases would show too little character spacing.
One further thought: if you could rely on Microsoft Office being installed, you could also try later versions of RichEdit that come with office. There's more about these on Murray Sargent's blog, as well as some interesting articles on font binding that might also help.
ExtTextOut allows you to specify the logical spacing between characters. It has the parameter lpDx, which is a const pointer to an array of values that indicate the distance between the origins of adjacent character cells. The Microsoft API documentation notes that if you don't set it, the function uses its own default spacing. I would say that's why ExtTextOut is working fine.
In particular, when you construct an EMR_EXTTEXTOUTW record in an EMF, it populates an EMRTEXT structure with this DX array, which (looking at one of your comments) is what allowed the RichEdit to insert the EMF using the information contained in the record; if you didn't set up font binding, the RTF record does some matching to work out which font to use.
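As an illustration of the lpDx parameter, here is a minimal sketch of calling ExtTextOut with an explicit advance array, written as a C# P/Invoke (the original code is presumably C or C++, where the call looks the same); the fixed advance value is an arbitrary choice for demonstration:

using System;
using System.Runtime.InteropServices;

static class ExtTextOutDemo
{
    [DllImport("gdi32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool ExtTextOut(IntPtr hdc, int x, int y, uint options,
                                  IntPtr lprc, string text, uint count, int[] lpDx);

    // Draws 'text' at (x, y) on an existing device context, forcing every
    // character cell to the same width by supplying an explicit lpDx array.
    public static void DrawWithFixedAdvances(IntPtr hdc, int x, int y,
                                             string text, int advance)
    {
        // One entry per character: the distance from each cell origin
        // to the origin of the next cell.
        var dx = new int[text.Length];
        for (int i = 0; i < dx.Length; i++)
            dx[i] = advance;

        ExtTextOut(hdc, x, y, 0, IntPtr.Zero, text, (uint)text.Length, dx);
    }
}

Passing null for lpDx instead lets GDI fall back to the font's own advance widths, which is the default spacing the documentation refers to.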
In terms of the RichEdit control, the following article might be useful:
Use Font Binding in a Rich Edit Control
After character sets are assigned, Rich Edit scans the text around the insertion point forward and backward to find the nearest fonts that have been used for the character sets. If no font is found for a character set, Rich Edit uses the font chosen by the client for that character set. If the client hasn't specified a font for the character set, Rich Edit uses the default font for that character set. If the client wants some other font, the client can always change it, but this approach will work most of the time. The current default font choices are based on the following table. Note that the default fonts are set per-process, and there are separate lists for UI usage and for non-UI usage.
If you haven't set the character set, it further explains that Rich Edit falls back to ANSI_CHARSET. However, it's most definitely a lot more complicated than that, as the blog article by Murray Sargent (a programmer at Microsoft) shows.

Unicode special characters appear differently in Firefox vs. Chrome/IE

I'm trying to find a way to make dingbats appear exactly the same in Firefox, Chrome, Safari and IE.
I noticed that the dingbats appear the same in IE/Chrome/Safari; however, in Firefox they look "thinner".
For example - try to visit the following page:
http://en.wikipedia.org/wiki/Dingbat
You'll notice that when viewing that page in Firefox - the characters look different in comparison to Chrome/IE.
Does anybody know why this happens, and how I can make Firefox display the characters EXACTLY as they appear in Chrome/IE?
I'm trying to find a way to make dingbats appear exactly the same
You will never make fonts look exactly the same in all browsers, whether the characters in question are Dingbats or not.
For me, most of the characters on that page don't render in IE or WebKit. IE traditionally has poorer font fallback than average and Firefox typically better than average. The font that Firefox and Opera manage to choose to render the symbols for me is Meiryo (a Japanese font installed with Windows Vista and later). On IE and WebKit it falls back to the much more limited selection of symbols available in Arial, leaving most of the characters missing.
So for the best chance of rendering symbol characters how you want, do as you do for any other characters, and specify the font you want, eg. CSS font-family: Meiryo. But of course anyone who doesn't have that font installed will get something different, and browser/OS settings may change how fonts are rendered in general.
The symbol characters from the Zapf Dingbats set are not safe to use on the web, as the basic default sets of fonts installed by operating systems do not typically include glyphs for most of them. (‘Wingdings’ on Windows does, but it's a legacy font with a custom mapping that puts the symbols on ASCII characters instead of where they should be in Unicode, so again it's not safe to use on the web.)
There are a few symbol characters that you can typically get away with using for commonly-available font sets, eg:
● ■ ☺ ☻ ♥ ♦ ♣ ♠ • ▲ ▼
Others I'd try to avoid.
Interesting... On a Mac OS X 10.6.6 machine with Firefox 3.6.13 and Chrome 8.0.552.231, the Wikipedia page does render the first table, the ITC Zapf Dingbats, slightly differently. The effect is most noticeable on the solid half-circle at the bottom left corner of the set of characters.
The main Unicode Dingbats table renders almost the same; Firefox generates boxes containing the 4 hex digits of the missing character for the missing symbols, but Chrome just generates empty boxes - I prefer Firefox's technique.
The browsers must either be using slightly different fonts or slightly different font sizes (though I can't detect a size difference by eye). I've not looked at the HTML that is being rendered.
On the whole, I think that this is within the realm of 'allowable variation' - but I'm not an expert. I suspect you have a world of worry ahead of you if you demand pixel-for-pixel similarity between browsers. The concern should be to get the message across clearly.
