Terminal control sequence for character display width? - terminal

Correct display of Unicode in a terminal would appear to benefit from the displaying app knowing the number of character cells used to display text. Functions like wcwidth() are a reasonable start, but there can be a lot of variation, for example what a terminal displays for invalid characters, ambiguous width Asian characters, combining characters out of context, etc.
Would it be reasonable to extend terminal apps with a new control sequence to measure with display width of a string, which display apps could use to characterize the terminal? If so, what details are worth considering, e.g. what sequence to use, whether to specify UTF-8, also how to handle terminals that do not know this hypothetical new control sequence? Would it have any likelihood of wide adoption?
If not, what is the flaw in the idea? Is perhaps reading the cursor position after display a better (and already supported) option? Or is there a good different approach?

There's no need, because the existing cursor position report (which can be used to get the position before and after printing a string) gives the length.
Adding a new control sequence to get the attributes of a character (width, combining, controls such as tab) wouldn't help much because the application still has to work with the system's locale information for performance reasons: it would drastically slow down an application if it had to ask after each character where the cursor really was.

Related

Low Level Text Output

I have not done any MACOS programming in several years. However, I now need to test out some algorithms for formatting text as paragraphs. For these tests, I need to be able to display characters with absolute position control.
These tests need to determine where line breaks are to go within a paragraph and the amount of work spacing, while taking into account hyphenation, ligatures, page breaks, kerning and the like.
When I last did such things, I was using the text functions of the CGContext class. Upon returning from my hiatus, I find all of these function are marked "deprecated" in the documentation—which conveniently leaves out what the current acceptable replacement is.
What is the current, approved process for drawing single glyphs on Mac OS?
Functions that deal with formatting strings would defeat the purpose of the academic experiment here.

As a letter / character may have color? like this: ✔️

I have found this letter / character in facebook, but how can this have a color? is just insane for me, look this: ✔️
Added image (From Firefox on windows)
It's not an ASCII character, it's likely an emoji. Emoji are part of Unicode and the actual glyph displayed to the user is open to interpretation by the platform displaying it. The spec suggests a name/description, but the implementation varies.
So while you may see a colored check mark, I see black & white. Other times, a single glyph will have multiple styles made available on a particular platform; for example, I can select multiple "skin" tones when I use a smiley face on my iPhone, but your Android device may only show a generic one.
Edit: The image edited into the original post is a perfect example. Using Chrome on Windows, I see a black check mark. The screenshot from Firefox shows green.
The symbols used here aren't ascii-encoded. They use the much more vast range of Unicode encoding. Ascii(extended) is restricted to a 256 symbol set.
The unicode interpretation for symbols/glyphs(small pictorial representation)(these ticks aren't characters), can vary for different platforms as some the range of unicode is open for usage and isn't set as global.
Which is why, while the unicode encryption remains the same for every device irrespective, the decryption is differently interpreted by different devices/online-platforms, allowing us to perceive either a coloured or a black symbol.

How do you change the letter-spacing/tracking in core text?

This could probably also be asked as "Is kCTKernAttributeName a misnomer?"
I need to change the letter spacing/tracking of some text in iOS. (The font I'm using is a little too tight at small sizes.) There are core graphics routines that will change character spacing, but those routines don't handle Unicode. There are other core graphics routines that are defined in terms of glyphs but those seem like a world of hurt, among other things, not having the safety net of reverting back to system fonts for glyphs that don't exist in my font.
So core text seems like the way to do this and core text supports kCTKernAttributeName on CFAttributedString. I think this will do what I want, though this really isn't kerning since kerning is a generally a character-pair attribute and this (appears to be, from the docs) just a uniform adjustment to the glyph advance for all glyphs, i.e., tracking.
If anyone knows before I go down the rather painful path of converting to the core text API ...
kCTKernAttribute name should do what you want. Setting it over a range of text adjusts the inter-glyph spacing consistently, irrespective of the specific glyphs.
I think part of the problem is that kerning seems to have been a virtual synonym of tracking (it's still just "adjust the spacing between (letters or characters) in a piece of text to be printed" in the dictionary that comes with OS X), and is now adopting just the meaning of pair kerning because of the redundancy. Probably an etymologist would be better placed to comment on that side of things...

Display of Asian characters (with Unicode): Difference in character spacing when presented in a RichEdit control compared with using ExtTextOut

This picture illustrates my predicament:
All of the characters appear to be the same size, but the space between them is different when presented in a RichEdit control compared with when I use ExtTextOut.
I would like to present the characters the same as in the RichEdit control (ideally), in order to preserve wrap positions.
Can anyone tell me:
a) Which is the more correct representation?
b) Why the RichEdit control displays the text with no gaps between the Asian Characters?
c) Is there any way to make ExtTextOut reproduce the behaviour of the RichEdit control when drawing these characters?
d) Would this be any different if I was working on an Asian version of Windows?
Perhaps I'm being optimistic, but if anyone has any hints to offer, I'd be very interested to hear.
In case it helps:
Here's my text:
快的棕色狐狸跳在懶惰狗1 2 3 4 5 6 7 8 9 0
apologies to Asian readers, this is merely for testing our Unicode implemetation and I don't even know what language the characters are taken from, let alone whether they mean anything
In order to view the effect by pasting these characters into a RichEdit control (eg. Wordpad), you may find you have to swipe them and set the font to 'Arial'.
The rich text that I obtain is:
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Arial;}}{\colortbl ;\red0\green0\blue0;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\fs22\u24555?\u30340?\u26837?\u33394?\u29392?\u29432?\u36339?\u22312?\u25078?\u24816?\u29399?1 2 3 4 5 6 7 8 9 0\par\pard\'a3 $$ \'80\'80\cf1\lang2057\fs16\par}
It doesn't appear to contain a value for character 'pitch' which was my first thought.
I don't know the answer, but there are several things to suspect:
There are several versions of the rich edit control. Perhaps you're using an older one that doesn't have all the latest typographic improvements.
There are many styles and flags that affect the behavior of a rich editcontrol, so you might want to explore which ones are set and what they do. For example, look at EM_GETEDITSTYLE.
Many Asian fonts come in two versions on Windows. One is optimized for horizontal layout, and the other for vertical layout. That latter usually has the same name, but has # prepended to it. Perhaps you are using the wrong one in the rich edit control.
UPDATE: By messing around with Wordpad, I was able to reproduce the problem with the crowded text in the rich edit control.
Open a new document in Wordpad on Windows 7. Note that the selected font is Calibri.
Paste the sample text into the document.
Text appears correct, but Wordpad changed the font to SimSun.
Select the text and change the font back to Calibri or Arial.
The text will now be overcrowded, very similar to your example. Thus it appears the fundamental problem is with font linking and fallback. ExtTextOut is probably selecting an appropriate font for the script automatically. Your challenge is to figure out how to identify the right font for the script and set that font in the rich edit control.
This will only help with part of your problem, but there is a way to draw text to a DC that will look exactly the same as it does with RichEdit: what's called the windowless RichEdit control. It not exactly easy to use: I wrote a CodeProject article on it a few years back. I used this to solve the problem of a scrollable display of blocks of text, each one of which can be edited by clicking on it: the normal drawing is done with the windowless RichEdit, and the editing by showing a "real" RichEdit control on the top of it.
That would at least get you the text looking the same in both cases, though unfortunately both cases would show too little character spacing.
One further thought: if you could rely on Microsoft Office being installed, you could also try later versions of RichEdit that come with office. There's more about these on Murray Sargent's blog, as well as some interesting articles on font binding that might also help.
ExtTextOut allows you to specify the logical spacing between records. It has the parameter lpDx which is a const pointer to an array of values that indicate the distance between origins of adjacent character cells. The Microsoft API documentation notes that if you don't set it, then it sets it's own default spacing. I would have to say that's why ExtTextOut is working fine.
In particular, when you construct a EMR_EXTTEXTOUTW record in EMF, it populates an EMR_TEXT structure with this DX array - which looking at one of your comments, allowed the RichEdit to insert the EMF with the information contained in the record, whereby if you didn't set a font binding then the RTF record does some matching to work out what font to use.
In terms of the RichEdit control, the following article might be useful:
Use Font Binding in a Rich Edit Control
After character sets are assigned, Rich Edit scans the text around the
insertion point forward and backward to find the nearest fonts that
have been used for the character sets. If no font is found for a
character set, Rich Edit uses the font chosen by the client for that
character set. If the client hasn't specified a font for the character
set, Rich Edit uses the default font for that character set. If the
client wants some other font, the client can always change it, but
this approach will work most of the time. The current default font
choices are based on the following table. Note that the default fonts
are set per-process, and there are separate lists for UI usage and for
non-UI usage.
If you haven't set the characterset, then it further explains that it falls back to ANSI_CHARSET. However, it's most definitely a lot more complicated than that, as that blog article by Murray Sargent (a programmer at Microsoft) shows.

How can I change the background color of specific characters in a RTF document?

I'm trying to output RTF (Rich Text Format) from a Ruby program - and I'd prefer to just emit RTF directly without using the RTF gem as I'm doing pretty simple stuff.
I would like to highlight specific characters in a DNA sequence alignment and from the docs it seems that I can either use \highlightN ... \highlight0 or \cbN ... \cb1
The problem is that I cannot get \cb to work in either Word:Mac 2008 or Mac TextEdit (\cf works fine so I know it's not a color table issue)
\highlight does work but seemingly only with two of the possible colors (black and red) and \highlight does not use the custom color table.
By creating simple docs in Word with character shading and saving as RTF I can see blocks of ridiculously verbose RTF code that presumably does what I want, but it is so impenetrable that I'm not seeing the wood for the trees.
Part of the problem may well be that Mac Word is just not implementing RTF properly. I don't have a Windows version of Word handy.
Anyone know the right way to shade blocks of text?
Thanks
--Rob
There is a note in the RTF Pocket Guide that says MS Word does not implement the \cb command. It says MS Word uses \chshdng0\chcbpatN (where "N" is the color number that you would use with \cb). The book recommends using something like the following for compatibility with programs that implement \cbN and/or \chshdng0\chcbpatN: {\chshdng0\chcbpat5\cb5 text}.
Note: The copy of the book I have was published in 2003, so it might be a bit out-of-date.
The sequence of RTF commands that seems to be most universally supported by RTF-capable applications is:
\chshdng10000\chcbpatN\chcfpatN\cbN
These commands:
set the shading to 100 percent
set the pattern foreground and background colors to the color from the color table (we're not actually specifying a shading pattern)
set the character background to the color from the color table
Word was the most difficult application to properly render background colors in:
Despite what the latest (1.9.1) RTF spec says, Word 2013 does not resolve \highlightN colors from the \colortbl. Instead, \highlightN maps to a predefined list of colors. It looks like those colors come from the 1.5 version of the RTF spec.
Regarding \cb, the 1.9.1 spec contains this helpful pointer at the end of the section on Color Table:
Note: Windows versions of Word have never supported \cbN, but it can be emulated by the control word sequence \chshdng0\chcbpatN.
This is almost a useful suggestion, except that if you read the documentation for \chshdngN:
Character shading. The N argument is a value representing the shading of the text in hundredths of a percent.
So, 0 turns out to not be a very useful value; 100 / 0.01 gives us the 10000 we used in the sequence above.
Use WordPad to create RTF documents, not Word. WordPad creates much simpler documents, i.e. approaching human-readable.
I use WordPad every time I need to display formatted text in a WinForms application, and need something that the RichTextBox control can handle being assigned to its Rtf parameter.

Resources