Cygwin displays error messages in Hebrew and garbled - makefile

I have been using Cygwin to build my Android library using the NDK's ndk-build script and Cygwin's make tool. It started giving me errors with a bunch of Latin non-English characters. When copying the text to Google, it was pasted as Hebrew (which I can read). Is there any way to force it to output errors in English? Any idea why this happens?

Check your environment variables for the correct locale. LANG or LC_MESSAGES are probably responsible. Set those to an English locale (in your profile to have that in future sessions as well) to get English error messages. Sorry, I'm a Windows person and know nearly nothing of Unix so you'd have to look up the specifics elsewhere, but this should be the general direction to go.
Some programs and libraries try to be overly smart by guessing the locale from the keyboard layout or the user's locale, often ignoring the fact that on Windows locale and UI language are two different concepts (and that different languages on the console are even harder to get right).
As for why the messages appear garbled, that's likely because the console window uses the wrong code page. The easiest fix is usually to use a TrueType font for the console window, but in this case neither Consolas nor Lucida Console include glyphs for Hebrew, so you'd only see boxes anyway.
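For the specifics of the locale variables mentioned above, here is a minimal sketch in Python. It assumes the tools involved use GNU gettext, which consults LANGUAGE, LC_ALL, LC_MESSAGES and LANG in that order. The helper name english_message_env is made up for illustration, and forcing LC_MESSAGES to "C" (untranslated messages) is one possible choice, not the only one.

```python
import os


def english_message_env(base=None):
    """Return a copy of the environment that asks for untranslated messages.

    Hypothetical helper: it removes the two variables that take precedence
    over LC_MESSAGES and then sets LC_MESSAGES to the "C" locale.
    """
    env = dict(os.environ if base is None else base)
    env.pop("LANGUAGE", None)   # gettext-specific override, checked first
    env.pop("LC_ALL", None)     # would otherwise override LC_MESSAGES
    env["LC_MESSAGES"] = "C"    # "C" means the program's built-in English text
    return env
```

A build could then be started with something like subprocess.run(["make"], env=english_message_env()). Putting the equivalent export lines in the shell profile, as the answer suggests, makes the setting stick for future sessions.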

Related

how does windows deal with drawing chars not in the current font

I have an app that is trying to display U+23CE (⏎). This is a terminal app, so we are using "Consolas"/"Cascadia"/"Courier". As far as I can see, none of these fonts have this character. And yet, in Visual Studio, when I am debugging this app, it actually displays it correctly in the debugger. Also, when displayed by the new Windows Terminal, it displays correctly. But when I use the app I am working with (actually Putty), it displays the "I don't know this character" glyph.
Putty is a classic Win32 app using ExtTextOutW() to draw that text. I have checked that the correct font is bound to the HDC.
I am assuming that Visual Studio and Windows Terminal are using DirectWrite or other more modern text output logic, but ultimately they have to be getting these unknown glyphs from somewhere.
UPDATE:
I found a font with that character ("Segoe UI Symbol"), and if I set Putty to use that font, it displays the missing character (woohoo). Sadly, this is a proportional font, so it looks terrible, and this is not the solution.
#dvix pointed me at a Microsoft page discussing this exact topic, but it's not clear which things are done by Windows and which by an app developer. I tried linking "Courier New" (Putty's default) to "Segoe UI Symbol", but it made no difference. Does the Putty code need to do all the work itself? Detect an unknown character, read the Registry, and substitute the font for that one char? That is certainly doable, but a pain.
Windows can be directed to "borrow" missing glyphs in a font from another font that carries them using font linking. This applies to both consoles and GUI apps that use GDI (DrawText, ExtTextOut) to render text in Windows 2000 and later.
For example, the following registry entry links the Consolas font to Segoe UI Symbol. It can be saved as a .reg file and merged into the registry, and it takes effect at the next logon.
REGEDIT4
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink]
"Consolas"=hex(7):53,45,47,55,49,53,59,4d,2e,54,54,46,2c,53,65,\
67,6f,65,20,55,49,20,53,79,6d,62,6f,6c,00,00
; "Consolas"=REG_MULTI_SZ:"SEGUISYM.TTF,Segoe UI Symbol"
One handy tool to explore coverage of the different fonts is BabelMap. For example, it can list the fonts that carry U+23CE (⏎) on a fairly clean Win10 system.
Another feature of BabelMap is the option to create temporary user-defined composite fonts on the fly, as opposed to the ones "statically" defined in the registry. This is presumably done using the MLang IMLangFontLink interface; for more about that, see Raymond Chen's "How to display a string without those ugly boxes" and Michael Kaplan's "Font substitution and linking #2".
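To double-check what the hex(7) value in the registry entry above actually contains, the bytes can be decoded by hand. The following is a small Python sketch; the function name is made up for illustration, and it assumes the REGEDIT4 convention that hex(7) holds single-byte characters, with each string ending in a NUL and the whole list ending in one more NUL.

```python
# The hex(7) payload from the .reg snippet, with the line continuation joined.
HEX_VALUE = (
    "53,45,47,55,49,53,59,4d,2e,54,54,46,2c,53,65,"
    "67,6f,65,20,55,49,20,53,79,6d,62,6f,6c,00,00"
)


def decode_regedit4_multi_sz(hex_text):
    """Decode a REGEDIT4-style hex(7) value into a list of strings."""
    raw = bytes(int(part, 16) for part in hex_text.split(",") if part.strip())
    # Splitting on NUL leaves empty chunks for the terminators; drop them.
    return [chunk.decode("ascii") for chunk in raw.split(b"\x00") if chunk]


print(decode_regedit4_multi_sz(HEX_VALUE))
```

The result is a single entry of the form "font file name,face name", which matches the commented REG_MULTI_SZ line in the snippet.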

Viewing Japanese MBCS text while remote debugging from English Windows machine?

Trying to debug an MBCS application that has had the strings, dialogs, etc. localized for Japanese. Seems to be a bug somewhere with a string getting truncated or something.
I am debugging from an English Windows 7 using Visual Studio 2013. Of course, since it is MBCS and not Unicode, when I view the strings, it is just gibberish. Probably, if it was Unicode, then the strings would display in Japanese while remote debugging, but it's not, and it is not really an option.
So, is there any way to use some special encoding trick to view the string as Japanese on my English system? I'm not going to set my local system for remote debugging to Japanese either.
So... basically looking for some kind of option to view the Japanese strings from the remote system as Japanese strings on my English system. Anybody else been down this road?
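As a rough illustration of why the strings show up as gibberish (this is not a debugger setting), here is a Python sketch. It assumes the localized application stores its text in code page 932 (Shift-JIS) and that the English machine interprets the same bytes as code page 1252; both code pages are assumptions, and the sample string is invented.

```python
# Invented sample text standing in for one of the localized strings.
text = "日本語のテキスト"

# Bytes as an MBCS build on a Japanese system would hold them (assumed cp932).
raw = text.encode("cp932")

# What an English system shows when it reads those bytes as cp1252.
misread = raw.decode("cp1252", errors="replace")

# Decoding the very same bytes with the right code page restores the text.
recovered = raw.decode("cp932")

print(misread)
print(recovered)
```

So if the raw bytes can be copied out of the debugger (from a memory view, for instance), decoding them offline with the correct code page is one way to read them without switching the local system to Japanese.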

Setting Default Keyboard Layout for Electron Application on Windows (Unable to copy UTF-8 text properly)

We have a problem where Farsi UTF-8 text that is programmatically copied from an Electron application loses its encoding when pasted into one specific application and is displayed as a set of ? characters. The issue persists even with manual text selection and a copy command invoked via either the context menu or Ctrl+C. Text copied from other sources such as browsers or text editors is transferred just fine.
We tried clipboard API of electron. We also implemented our own helper to verify the issue is not with the clipboard itself.
We also prepended the text with UTF-8 BOM character before writing it to the clipboard.
One interesting observation was that once the text is pasted into some text editor and then recopied, target app received the text properly. We also noticed that changing the keyboard layout to target language when the electron app is focused, resolves the issue as well. In addition, we realized that Windows changes default keyboard layout to English when the application is launched.
Following on these clues, we configured NSIS bundler to set the default language to Persian so that maybe Windows detects it as default keyboard language as well. Description of the application shows Persian as the language but Windows does not respect it and reverts the language to system default upon launch.
We tried running a script on application startup to mimic a Farsi character keyboard input, creating temporary input fields, and a set of other hacks to maybe trick the Windows/application into properly handling these texts. Keep in mind we can't rely on user to perform idiotic actions on every application launch, to fix a problem that shouldn't exist in first place. That's why we need this issue to be resolved programmatically.
Right now the only solution that comes to mind is to force Windows to set the keyboard layout for our application to Persian via registry entries, using a separate script that the user needs to run only once or that can be run after each installation. I'm not familiar with the Windows registry entries. My searches came up empty: the results focused on how to do it for the whole system, but we don't wanna mess with their whole system configuration.
Any other suggestions regarding this issue are highly appreciated.
Other information that you might find relevant:
OS is Windows 7.
Target app is an accounting software and the vendor refuses to provide any support with the integration, so I have very little information about inner workings of that application.
html lang attribute of electron template is set to fa.
meta charset attribute is set to utf-8.
Application is bundled with electron-builder and NSIS.

Why some software can display all characters and some not?

Reference text: どうもありがとうございました
Copied to:
Notepad/Notepad++: displays it with no problems
LibreOffice Writer: it changes the font family to work, if you convert to Lucida Console, square boxes appear
Windows: displays it with no problems
Console: it needs the correct chcp and a font family (Lucida Console displays square boxes here too) which can display them if I am right
Is it possible to explain why Notepad can display any text in any font family and LibreOffice + Console cannot? Where is(are) the difference(s)? Is it possible to have the same behaviour on the console as the Notepad does for example?
Some Windows fonts have glyphs for many different scripts, some cover a few scripts, and many cover just one. (Fonts which support many scripts are sometimes called "Unicode fonts," which can be a misleading term. In other OSes, these kinds of fonts are more prevalent. Windows itself doesn't ship with any, though I think you get one or two with the Office suite.)
When you try to output text in multiple scripts through the standard Windows functions with one of the well-known fonts, Windows uses font fallback and/or font linking, which automatically switches between fonts as needed to output the whole string. Most programs, like Notepad and Notepad++, thus get coverage automatically.
I haven't read the LibreOffice code, but I suspect that when you select a font for a span of text, it sticks with that font, effectively preventing Windows's font fallback and font linking mechanisms from helping. This isn't surprising, since a WYSIWYG editor is likely to use lower-level APIs for outputting text in order to have more typographic control. But using the lower-level APIs means you don't get fallback and linking for free, so you'd have to implement it yourself, and that's a lot of extra work that may not be important to very many users.
The Windows console has a lot of legacy and limitations that persist for backward compatibility with older programs. The console mostly emulates DOS systems, which didn't have any sort of Unicode support and instead relied on "Code Pages," which are, roughly speaking, alternate mappings between character values and glyphs. Code Pages are geared at just one (or maybe two) scripts, so if you needed characters from another script, you were basically out of luck. I think modern versions of Windows have hacked in some support for a pseudo code page that supports UTF-8, but I've never gotten it to work well and it, too, has limitations.
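The code page limitation can be seen with the reference text from the question. This Python sketch simply asks whether a few encodings can represent the string at all. The choice of code pages (437 for the classic US console, 1252 for Western Windows, 932 for Japanese) is for illustration only, and can_encode is a made-up helper name.

```python
REFERENCE_TEXT = "どうもありがとうございました"


def can_encode(text, codec):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for codec in ("cp437", "cp1252", "cp932", "utf-8"):
    print(codec, can_encode(REFERENCE_TEXT, codec))
```

The single-script code pages have no slot for hiragana, so no font choice can help there. A Japanese code page or UTF-8 can carry the characters, after which it is still up to the console font to have the glyphs.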

ncurses support for italics?

Some terminals, such as urxvt, support displaying text in italics via the sitm and ritm terminfo entries:
echo `tput sitm`italics`tput ritm`
I'd like to use this in an application I've got which wants to render real italics into the console. Unfortunately the application is ncurses-based, and ncurses doesn't seem to have an attribute for italics --- it's got a whole bunch, including invisible text (which I'm sure is useful for something), but no italics.
Does anyone know of a way to trick ncurses into displaying italic text, or am I going to have to ditch ncurses and rewrite the program to use raw terminal sequences?
It looks like ncurses 5.10 will contain A_ITALIC. The change went in on August 31, 2013:
http://invisible-island.net/ncurses/NEWS-contents.html#t20130831
pdcurses supports A_ITALIC as well so there's at least a vague nod to compatibility. Unfortunately, this won't help me much until 5.10 is released and then becomes widespread...
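For what it's worth, here is a sketch of how an application could use the attribute only where it exists. It uses Python's curses binding rather than the C API the question is about, which is a swapped-in assumption: Python exposes curses.A_ITALIC (since Python 3.7) only when the underlying library provides it. The fallback to underline and the function names are illustrative choices.

```python
import curses


def italic_attribute():
    """Return A_ITALIC if this curses build has it, else fall back to underline."""
    return getattr(curses, "A_ITALIC", curses.A_UNDERLINE)


def demo(stdscr):
    """Draw one word with the italic (or fallback) attribute, then wait for a key."""
    stdscr.addstr(0, 0, "italics", italic_attribute())
    stdscr.addstr(1, 0, "press any key")
    stdscr.refresh()
    stdscr.getch()
```

Running curses.wrapper(demo) from a real terminal shows the result. Whether the text actually comes out slanted still depends on the terminal supporting sitm/ritm.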

Resources