View Hindi (Arabic) numbers in Firefox

When viewing Arabic pages in Firefox, all the numbers are still Latin, unlike in IE, for example, which shows Arabic (Hindi) numerals on Arabic pages and Latin numerals on other pages.
I found a tweak online through about:config (bidi.numeral), but it switches every page, in every language, to a single type of numeral.
I am looking for a fix that chooses the numerals depending on the page rather than using one fixed setting.

bidi.numeral works as described in the documentation. If you set it to 2, you get Arabic numerals in Arabic contexts: digits that occur in the middle of Arabic text (or, I think, text in any other right-to-left language) are rendered as Arabic digits. Digits at the beginning of the text or in the middle of non-Arabic text are rendered as the default digits [0-9].
If you set it to 3, all the digits in the browser will be rendered as Arabic numerals.
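The contextual behaviour described for the value 2 can be illustrated outside the browser. The following Python sketch is not Firefox's implementation. It assumes that "Arabic context" means the nearest preceding strong character is an Arabic letter (bidi class AL), and the function name shape_digits_by_context is made up for this example.

```python
import unicodedata

ARABIC_INDIC_ZERO = 0x0660  # U+0660 ARABIC-INDIC DIGIT ZERO


def shape_digits_by_context(text):
    """Replace ASCII digits with Arabic-Indic digits when they follow Arabic text.

    Assumption: a digit is in "Arabic context" if the most recent strong
    character before it has bidi class AL (Arabic letter).
    """
    out = []
    arabic_context = False  # digits at the start of the text stay ASCII
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "AL":
            arabic_context = True
        elif bidi_class in ("L", "R"):
            arabic_context = False
        if arabic_context and ch in "0123456789":
            out.append(chr(ARABIC_INDIC_ZERO + int(ch)))
        else:
            out.append(ch)
    return "".join(out)


print(shape_digits_by_context("abc 12 \u0639\u062f\u062f 34"))
```

In this sketch the digits after the Latin letters stay as they are, while the digits after the Arabic word are replaced, which mirrors the per-context behaviour the answer describes.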

Related

Arabic letter noon ghunna incorrectly displayed with a dot

Background
The Arabic letter noon ghunna (ں) is displayed incorrectly on my Windows 10 PC (in Chrome, Edge, Notepad and Word). The sequence ALEF, NOON GHUNNA, ALEF is displayed as:
The same sequence is displayed correctly on my Android phone without the dot:
For completeness, the actual Unicode string (for copy/paste purposes) is:
اںا
There has been some controversy regarding this letter (L2-12/381), which has been settled by now, as seen from the Unicode Standard, which states (since version 7 and up to the current version 11):
Rendering systems should display U+06BA as a dual-joining letter, with all four contextual forms shown dotless, regardless of the language of the text.
But the dot appears in word-initial (ںا) and mid-word (اںا) positions. Final (اں) and isolated (ں) forms are fine.
Question
Now my question is, how can this be fixed, other than by waiting for Microsoft to fix it? I want to understand where the problem lies. Is it in the Uniscribe library, or is it down to the font being used? Can it be fixed by using a specifically crafted TrueType/OpenType font?
This turned out to be a font problem. Quite a few fonts on fonts.google.com show this letter correctly:
https://fonts.google.com/?subset=arabic&selection.family=Amiri|Aref+Ruqaa|Cairo|El+Messiri|Harmattan|Lemonada|Mada|Reem+Kufi|Scheherazade
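Before blaming a font, it can help to confirm that the text really contains U+06BA and not the dotted U+0646 ARABIC LETTER NOON. A minimal Python sketch for that check follows; the helper name describe is only illustrative.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# ALEF, NOON GHUNNA, ALEF, written with escapes to avoid copy/paste surprises
for line in describe("\u0627\u06ba\u0627"):
    print(line)
```

If the middle line names NOON GHUNNA, the input is correct and the dot comes from the rendering side, which in this case means the font.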

Japanese text is not shown properly

My colleague wants to use my application in Japan, so I tested the application on Windows 7 with the regional setting for the non-Unicode language changed to Japanese.
Now I have the problem that some text is shown correctly with Japanese characters, while other text is displayed with wrong (non-Japanese) characters.
I also tested other languages, such as Chinese and Taiwanese, and all the text is correct. I only have problems with Japanese.
Does anybody have an idea what's wrong?
This is most likely due to Han unification ("Unihan") in Unicode, which encodes similar-looking Japanese and Chinese characters as the same Unicode character even though their glyphs differ. Many systems use a Chinese font rather than a Japanese font by default, so the Chinese glyph is displayed, which looks wrong in Japanese text. Changing the non-Unicode language to Japanese would not solve this problem.
To display the proper Japanese glyphs, you need to make sure that your application uses a Japanese font. Normally, if your program runs in an entirely Japanese environment, which includes setting the computer's default language and font to Japanese, a Japanese font will be used. Alternatively, you can specify or embed a Japanese font in your application to make sure it works as intended.

Arabic-English Transliteration using unsupported font

I am working on transliteration between Arabic (Ar) and English (En) text.
Here is the file that defines the character-by-character replacements: https://github.com/Shnoulle/Ar-PHP/blob/master/Arabic/data/Transliteration.xml
Now the issue is:
I am dealing with the fonts robert_bold.ttf and robert_regular_0.ttf, which contain some special characters with an underline or overline, as in this screenshot.
I have the .ttf files, so I can see these fonts on my system. But in my application, and in the Transliteration.xml above, these characters come out as junk such as [, } and so on.
How can I add support for these unsupported characters to the Transliteration.xml file?
<pair>
<search>ي</search>
<replace>y</replace>
</pair>
<pair>
<search>ى</search>
<replace>a</replace>
</pair>
<pair>
<search>أ</search>
<replace>^</replace> // Here is one of the character s_ (s with underscore not supported)
</pair>
It seems that the font is not Unicode encoded but contains the underlined letters at some arbitrarily assigned codes. While this works up to a point, it does not work across applications, of course. It works only when that specific font is used.
The proper way is to use correct Unicode characters such as U+1E0F LATIN SMALL LETTER D WITH LINE BELOW “ḏ” and, for rendering, try to find fonts containing it.
An alternative is to use just basic Latin letters with some markup, say <u>d</u>. This means that the text must not be treated as plain text in later processing, and in rendering, the markup should be interpreted as requesting a line under the letter(s).
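The Unicode-based approach can be sketched in Python. The mapping below is hypothetical and only for illustration; it is not taken from Transliteration.xml, and the names with_line_below, TABLE and transliterate are made up here. The sketch relies on U+0331 COMBINING MACRON BELOW, which NFC normalization merges into a precomposed letter where Unicode has one (such as U+1E0F) and leaves as a separate combining mark where it does not (as for "s").

```python
import unicodedata

COMBINING_MACRON_BELOW = "\u0331"


def with_line_below(letter):
    """Return the letter with a line below, precomposed when Unicode has one."""
    return unicodedata.normalize("NFC", letter + COMBINING_MACRON_BELOW)


# Hypothetical table for illustration only; real tables depend on the
# transliteration scheme in use.
TABLE = {
    "\u064a": "y",                   # YEH, as in the question's XML
    "\u0630": with_line_below("d"),  # THAL -> U+1E0F (d with line below)
    "\u062b": with_line_below("t"),  # THEH -> U+1E6F (t with line below)
}


def transliterate(text):
    """Replace each character found in TABLE and keep everything else as is."""
    return "".join(TABLE.get(ch, ch) for ch in text)


print(transliterate("\u0630\u064a"))
print(len(with_line_below("s")))  # "s" has no precomposed form, so two code points
```

Because the replacements are real Unicode characters, the result no longer depends on one specific font with privately assigned codes; any font that covers these characters can render it.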

Does Google Chart support UTF-8 Characters?

I have a title and labels containing Unicode characters in a Google Chart, but they are not being displayed properly.
Here's an example: http://chart.apis.google.com/chart?chs=300x225&cht=p3&chco=344566,C4C4C4&chds=0,90&chma=70,70&choe=UTF-8&chtt=Test&chd=t:27933485,20611682,34172068&chl=Un%E9%A7%85xbr%E1%83%A6cker|Test1|Test2
As you can see, the characters do not appear correctly.
Is there a way to make Google Charts display UTF-8 characters properly? I've tried many things, but nothing has worked for me.
The problem appears to be the Unicode code points (E9A785 -> 99C5 and E183A6 -> 10E6) that you are providing. These characters do not appear to be displayed in a Google chart. Experiments with other code points (specifying them as UTF-8 in the same format as your query) appear to work fine.
The particular characters in your example (the first is from the CJK Unified Ideograms and the second from Georgian) are a little strange. You might want to double check that they are correct.
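One way to double check is to decode the percent-encoded label locally and list the non-ASCII code points it contains. This is a minimal Python sketch using the standard urllib.parse module; the helper name non_ascii_code_points is made up for this example.

```python
from urllib.parse import quote, unquote


def non_ascii_code_points(percent_encoded):
    """Decode a percent-encoded UTF-8 string and list its non-ASCII code points."""
    text = unquote(percent_encoded, encoding="utf-8")
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0x7F]


# The first chl label from the question's URL
print(non_ascii_code_points("Un%E9%A7%85xbr%E1%83%A6cker"))

# For comparison, how a character is percent-encoded as UTF-8
print(quote("\u00fc", safe=""))
```

The first call reports U+99C5 and U+10E6, matching the code points named above. If those are not the characters that were intended, the encoding step that produced the URL is the place to look.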

How to get the Unicode code point of the displayed glyph for a Unicode character

Windows uses the Uniscribe library to substitute glyphs for typed Arabic and Indic characters based on their position. The resulting glyph still carries the original Unicode code point of the typed character, although it has its own dedicated representation in Unicode.
How can I get the Unicode code point of what is actually displayed, not of what was typed?
There are lots of tools for this, such as ICU, Charmap and others. I recommend http://unicode.codeplex.com, which uses the Unicode Character Database to represent characters.
Note that Unicode only provides information about characters and says nothing binding about their visual representation; the standard merely shows example glyphs as a suggestion. So to view each code point you need a font with broad Unicode coverage, such as Arial Unicode MS, which is the largest and the best choice on the Windows platform.
Most characters are implemented in this font, but for newer characters you need an update for it (if such an update exists), or you can use another font that you know implements the characters you want.
Your interpretation of what is happening in Uniscribe is not correct.
Once you have glyphs, the original information is gone; there is no reliable way to go back to Unicode.
Even without going to Arabic, there is no way to tell whether the glyph for the fi ligature (for example) comes from 'f' and 'i' (U+0066 U+0069) or from 'ﬁ' (U+FB01).
(http://www.fileformat.info/info/unicode/char/fb01/index.htm)
Also, some of the resulting glyphs do not have a Unicode value associated with them, so there is no "Unicode of what is actually displayed".
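The ambiguity can be seen at the character level with Python's unicodedata module. This sketch works on characters, not on rendered glyphs, so it cannot inspect what Uniscribe produced; it only shows that a presentation-form code point and the plain letters are compatibility equivalent, which is why a displayed ligature or positional form does not identify its source. The function name fold_presentation_forms is made up for this example.

```python
import unicodedata


def fold_presentation_forms(text):
    """Map compatibility characters (ligatures, positional forms) to base letters."""
    return unicodedata.normalize("NFKC", text)


# U+FB01 LATIN SMALL LIGATURE FI folds to the two letters "f" and "i"
print(fold_presentation_forms("\ufb01") == "fi")

# U+FE91 ARABIC LETTER BEH INITIAL FORM folds to U+0628 ARABIC LETTER BEH
print(fold_presentation_forms("\ufe91") == "\u0628")
```

Both comparisons hold, so two different inputs map to the same base text. Going in the other direction, from a displayed shape to a single original code point sequence, is therefore not well defined.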

Resources