The Cambria Math font has Unicode characters beyond 0xFFFF. You can see them in a Word document by inserting a Symbol and selecting the Cambria Math font. (The Windows Character Map, incidentally, does not show these characters.) My question: how do I display those Unicode characters in a Windows app using TextOut()?
To display these supplementary code points you need to use UTF-16 surrogate pairs.
A surrogate pair represents a single code point beyond 0xFFFF as two 16-bit code units (wide characters). You simply pass the surrogate pair to TextOut() and it will be displayed.
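For illustration, a minimal sketch (the helper name and the example code point are mine, not from the question):

#include <windows.h>

// Minimal sketch: encode one supplementary code point (> 0xFFFF) as a
// UTF-16 surrogate pair and draw it with TextOutW. Assumes a font that
// contains the glyph (e.g. Cambria Math) is already selected into hdc.
void DrawSupplementaryCodePoint(HDC hdc, int x, int y, UINT32 cp)
{
    wchar_t buf[2];
    cp -= 0x10000;                              // offset into the supplementary planes
    buf[0] = (wchar_t)(0xD800 + (cp >> 10));    // high (lead) surrogate
    buf[1] = (wchar_t)(0xDC00 + (cp & 0x3FF));  // low (trail) surrogate
    TextOutW(hdc, x, y, buf, 2);                // pass both code units
}

For example, DrawSupplementaryCodePoint(hdc, 10, 10, 0x1D400) draws U+1D400 (MATHEMATICAL BOLD CAPITAL A).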
I am attempting to render a series of Unicode characters onto a spritesheet. This works quite well for most characters, including Cyrillic ones.
When using GetCharABCWidthsFloat with certain CJKV characters, however, the ABCFLOAT::abcfB member comes back lower than expected. The values do not account for underhangs or overhangs, which is the exact purpose of the ABC widths:
The B spacing is the width of the drawn portion of the character glyph.
Source: ABCFLOAT | Microsoft Docs
As you can see, none of the characters overlap left-to-right, except the last few.
I work around this with a customizable padding option to handle such cases, but this bloats the rest of the glyphs and thus requires a larger surface.
The font being used is Arial. For the character 美, ABC returns (2, 10, 2), which sums to an advance of 14 pixels, when in fact 17 pixels are needed.
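For reference, a minimal sketch of the call that produces those numbers (the setup code is my assumption; 0x7F8E is the code point of 美):

#include <windows.h>

// Sketch: query the ABC widths quoted above. Assumes hFont is an Arial
// HFONT created elsewhere.
ABCFLOAT QueryAbc(HFONT hFont)
{
    HDC hdc = CreateCompatibleDC(NULL);
    HGDIOBJ old = SelectObject(hdc, hFont);
    ABCFLOAT abc = {};
    GetCharABCWidthsFloatW(hdc, 0x7F8E, 0x7F8E, &abc); // 美
    // abc.abcfA, abc.abcfB, abc.abcfC -- e.g. (2, 10, 2) as described
    SelectObject(hdc, old);
    DeleteDC(hdc);
    return abc;
}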
I use TextOut to actually render the glyphs, but I wonder whether someone out there has experienced this and come up with a universal solution.
Using functions like GetTextExtentPoint32W or DrawTextEx to get a bounding rectangle does not allow precise per-character placement, which is the whole point of the ABC widths. And some functions I have not mentioned only work with TrueType fonts.
I suspect that certain characters shift to a different font under certain conditions, causing the results to be inaccurate. If that is the case, is there a way to determine that a character is not available in a font, so that I can reproduce what Windows does automatically? That is, is there some way to determine when a character should fall back to another font, and which font that should be?
I have been stuck on this problem for quite some time, so input from anyone with experience with these APIs would be greatly appreciated!
From the documentation on GetCharABCWidthsFloat:
The ABC widths of the default character are used for characters outside the range of the currently selected font.
Arial contains a lot of characters, including Cyrillic ones, but it does not contain CJKV ideographs. Other text-related calls may give you the false impression that it does (through a default/fallback font mechanism).
Before using (indeed, before fetching) the ABCFLOAT values, you should first check that the characters you want metrics for are within the range of the currently selected font.
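One way to perform that check (a sketch, not the only approach) is GetGlyphIndicesW with the GGI_MARK_NONEXISTING_GLYPHS flag, which marks code units the selected font cannot represent with 0xFFFF:

#include <windows.h>
#include <vector>

// Sketch: returns true only if every UTF-16 code unit in text has a
// real glyph in the font currently selected into hdc. Note this is a
// per-code-unit check; it does not handle surrogate pairs.
bool FontHasGlyphs(HDC hdc, const wchar_t* text, int len)
{
    std::vector<WORD> indices(len);
    if (GetGlyphIndicesW(hdc, text, len, indices.data(),
                         GGI_MARK_NONEXISTING_GLYPHS) == GDI_ERROR)
        return false;
    for (WORD gi : indices)
        if (gi == 0xFFFF)          // no glyph for this code unit
            return false;
    return true;
}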
I have strings that are mostly standard alphanumeric and other printable ASCII characters, and would like to display them in a console window on Windows. What I'm looking for is either a console font or a set of Unicode characters that represent the numbers 0-255 (0-FF) using a single glyph each. The closest thing I am aware of is Unicode's small set of circled numbers 1-20, with numbers 21-50 elsewhere. Something along those lines, but for 0-255 (or 0-FF), is what I'm trying to find.
It seems to me that this would be a relatively common need/desire, but I've been unable to track down a solution. Any help appreciated!
The C0 controls (and DEL) can be represented by the Control Pictures block (U+2400-U+2421). The remaining code points of the C0 Controls and Basic Latin, and C1 Controls and Latin-1 Supplement blocks can be represented by themselves. You may have to test a few fonts to find one that supports all these characters.
However, you said "ASCII" and "0-255", and ASCII has only 128 code points. Your code points 128-255 must come from some unnamed character set. You probably mean one of the well-known ones, but they are so numerous that a detailed answer isn't practical.
There is also the Unicode BMP Fallback SIL font that covers U+0000 to U+FFFF (but not U+10000 to U+10FFFF).
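Putting this together, a hedged sketch of the mapping (the function name is mine; it assumes your 128-255 range is Latin-1):

// Sketch: map a byte value 0-255 to a single displayable code point.
// C0 controls map into the Control Pictures block (U+2400 + code) and
// DEL maps to U+2421; everything else represents itself. The C1
// controls (0x80-0x9F) have no dedicated pictures, so they are left
// as-is here; substitute your own convention for those if needed.
wchar_t DisplayCharForByte(unsigned char b)
{
    if (b < 0x20)  return (wchar_t)(0x2400 + b); // C0 control picture
    if (b == 0x7F) return 0x2421;                // SYMBOL FOR DELETE
    return (wchar_t)b;                           // printable / Latin-1
}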
GetTextExtentPoint32 returns different character width ratios (e.g., the width of '9' versus a space) than Word or Acrobat use when displaying the same font (e.g., 10-point Arial).
This matters because I'm trying to prepare clipboard strings that will get pasted into apps that don't support much formatting (no tabs or tables), but I still need to align certain columns of info. I'm trying to overcome this challenge by dynamically calculating the number of spaces I need to insert (remember, no tabs allowed!).
For example, calling GetTextExtentPoint32 with 10-point Arial selected gives a logical-unit width of 7 for the digit '9' and a logical-unit width of 4 for a space. This ratio proves correct when rendering with something like DrawText.
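A sketch of the kind of measurement being described (the setup is my assumption; hdc is a device context with 10-point Arial selected):

#include <windows.h>

// Sketch: measure the logical-unit widths quoted above.
void MeasureWidths(HDC hdc)
{
    SIZE nine = {}, space = {};
    GetTextExtentPoint32W(hdc, L"9", 1, &nine);  // e.g. nine.cx == 7
    GetTextExtentPoint32W(hdc, L" ", 1, &space); // e.g. space.cx == 4
}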
However, when I export strings to Word or Acrobat, it turns out that two spaces in this font exactly equal the width of one '9' (whether looking at a single '9' or nine contiguous '9's). I don't know much about fonts, but it doesn't appear to be a kerning issue; GetCharABCWidths shows 0 for both the A and C widths.
Does anyone know why Word and Acrobat do not show the same proportions/measurements for a given font as Windows itself? Is there a way to calculate this?
What is the difference between a glyph range and a character range in NSTextView?
- (NSRectArray)rectArrayForCharacterRange:(NSRange)charRange withinSelectedCharacterRange:(NSRange)selCharRange inTextContainer:(NSTextContainer *)container rectCount:(NSUInteger *)rectCount;
- (NSRectArray)rectArrayForGlyphRange:(NSRange)glyphRange withinSelectedGlyphRange:(NSRange)selGlyphRange inTextContainer:(NSTextContainer *)container rectCount:(NSUInteger *)rectCount;
I think a glyph range and a character range are always the same. When I call the two methods with the same range value for charRange and glyphRange, the output NSRectArray is the same.
Am I misunderstanding?
There isn't always a one-to-one correspondence between glyphs and characters. A ligature, for example, is a single glyph that represents two or more characters.
For this reason a glyph range and a character range may not be the same. The character string "fi" contains two characters, but the layout manager may decide to display it as a single glyph (the fi ligature), so the glyph range may have a length of 1.
Remember, glyphs are the graphical representations of characters. Think of a character like 'a' as an abstract entity that is the same regardless of typeface or style. As a character, a Times New Roman Italic 'a' is just as much an 'a' as a Comic Sans Bold 'a'. The distinction in visual appearance is only made at the glyph level.
The reference documentation for -rectArrayForCharacterRange: says:
Returns an array of rectangles and, by reference, the number of such rectangles, that define the region in the given container enclosing the given character range.
The reference documentation for -rectArrayForGlyphRange: says:
Returns an array of rectangles and, by reference, the number of such rectangles, that define the region in the given container enclosing the given glyph range.
Given that the Windows API function GetGlyphIndices() can translate a two-byte Unicode character code into a glyph index, I intend to hardcode those glyph indices instead of the Unicode code points. Is that possible?
I understand that Microsoft could later change the value this function returns for a particular Unicode code point, but my expectation is that in that situation the current glyph index would be kept in the glyph set.
In other words, my understanding is that if Microsoft decides to associate a new glyph index with a Unicode code point, it will enlarge the glyph set while keeping the old glyphs.
Could someone confirm this?
There is no guarantee that new glyphs will always be appended. (And what happens if a glyph is deleted?)
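Given that caveat, a safer pattern (a sketch using documented calls; the function name is mine) is to resolve the indices at runtime with GetGlyphIndicesW and draw them with ExtTextOutW and the ETO_GLYPH_INDEX flag, rather than hardcoding them:

#include <windows.h>
#include <vector>

// Sketch: look the glyph indices up at runtime instead of hardcoding
// them, then draw with ETO_GLYPH_INDEX so the string is interpreted
// as glyph indices rather than character codes.
void DrawViaGlyphIndices(HDC hdc, int x, int y, const wchar_t* text, int len)
{
    std::vector<WORD> indices(len);
    GetGlyphIndicesW(hdc, text, len, indices.data(), 0);
    ExtTextOutW(hdc, x, y, ETO_GLYPH_INDEX, NULL,
                (LPCWSTR)indices.data(), len, NULL);
}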