ToAscii/ToUnicode in a keyboard hook destroys dead keys - Windows

It seems that if you call ToAscii() or ToUnicode() while in a global WH_KEYBOARD_LL hook, and a dead-key is pressed, it will be 'destroyed'.
For example, say you've configured your input language in Windows as Spanish, and you want to type an accented letter á in a program. Normally, you'd press the single-quote key (the dead key), then the letter "a", and then on the screen an accented á would be displayed, as expected.
But this doesn't work if you call ToAscii() or ToUnicode() in a low-level keyboard hook function. It seems that the dead key is destroyed, so no accented letter á shows up on screen. Removing the call to either function resolves the issue... but unfortunately, I need to be able to call those functions.
I Googled for a while; a lot of people seem to have this issue, but no good solution was provided.
Any help would be much appreciated!
EDIT: I'm calling ToAscii() to convert the virtual-key code and scan code received in my LowLevelKeyboardProc hook function into the resulting character that will be displayed on screen for the user.
I tried MapVirtualKey(kbHookData->vkCode, 2), but this isn't as "complete" a function as ToAscii(); for example, if you press Shift + 2, you'll get '2', not '#' (or whatever Shift + 2 will produce for the user's keyboard layout/language).
ToAscii() is perfect... until a dead-key is pressed.
EDIT2: Here's the hook function, with irrelevant info removed:
LRESULT CALLBACK keyboard_LL_hook_func(int code, WPARAM wParam, LPARAM lParam) {
    LPKBDLLHOOKSTRUCT kbHookData = (LPKBDLLHOOKSTRUCT)lParam;
    BYTE keyboard_state[256];
    if (code < 0) {
        return CallNextHookEx(keyHook, code, wParam, lParam);
    }
    WORD wCharacter = 0;
    GetKeyboardState(keyboard_state); /* pass the array itself, not its address */
    int ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
                     keyboard_state, &wCharacter, 0);
    /* If ta == -1, a dead key was pressed. The dead key is "destroyed" and
     * you can no longer create any accented characters. Remove the call to
     * ToAscii() above, and accented characters work again. */
    return CallNextHookEx(keyHook, code, wParam, lParam);
}

Quite an old thread. Unfortunately it didn't contain the answer I was looking for, and none of the answers seemed to work properly. I finally solved the problem by checking the MSB of MapVirtualKey's return value before calling ToUnicode/ToAscii. It seems to work like a charm:
if (!((MapVirtualKey(kbHookData->vkCode, MAPVK_VK_TO_CHAR) >> (sizeof(UINT) * 8 - 1)) & 1)) {
    ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
            keyboard_state, &wCharacter, 0);
}
Quoting MSDN on the return value of MapVirtualKey, if MAPVK_VK_TO_CHAR is used:
[...] Dead keys (diacritics) are indicated by setting the top bit of the return value. [...]

Stop using ToAscii() and use ToUnicode().
Remember that ToUnicode() may return nothing on dead keys - that is why they are called dead keys.
Any key will have a scan code or a virtual-key code, but not necessarily a character.
You shouldn't conflate buttons with characters - assuming that every key/button has a text (Unicode) representation is wrong.
So (a minimal sketch of this split follows the list):
- for text input, use the characters reported by Windows;
- for checking pressed buttons (e.g. games), use scan codes or virtual keys (virtual keys are probably better);
- for keyboard shortcuts, use virtual-key codes.
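For illustration, here is a minimal window-procedure sketch of that split (my addition, not part of the original answer; it is the standard Win32 WM_CHAR/WM_KEYDOWN pattern):
#include <windows.h>

LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch (msg) {
    case WM_CHAR:
        // Text input: wParam is a UTF-16 code unit; Windows has already
        // applied the layout, the modifiers, and any preceding dead key.
        break;
    case WM_KEYDOWN:
        // Shortcuts: compare virtual-key codes, never characters.
        if (wParam == 'S' && (GetKeyState(VK_CONTROL) & 0x8000)) {
            // Ctrl+S pressed
        }
        break;
    }
    return DefWindowProcW(hwnd, msg, wParam, lParam);
}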

Call the ToAscii() function twice for correct processing of a dead key, like this:
int ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
                 keyboard_state, &wCharacter, 0);
/* Call it again: the second call feeds the dead key back into the
 * keyboard buffer, undoing the state change made by the first call. */
ta = ToAscii((UINT)kbHookData->vkCode, kbHookData->scanCode,
             keyboard_state, &wCharacter, 0);
if (ta == -1)
    ...

Calling ToAscii or ToUnicode twice is the answer.
I found this and converted it to Delphi, and it works!
cnt:=ToUnicode(VirtualKey, KeyStroke, KeyState, chars, 2, 0);
cnt:=ToUnicode(VirtualKey, KeyStroke, KeyState, chars, 2, 0); //yes call it twice

I encountered this issue while creating a key logger in C#, and none of the above answers worked for me.
After some deep blog searching, I stumbled across this keyboard listener, which handles dead keys perfectly.

Here is the full code, which covers dead keys and shortcut keys using ALT + NUMPAD - basically a full implementation of text-field input handling:
[DllImport("user32.dll")]
public static extern int ToUnicode(uint virtualKeyCode, uint scanCode, byte[] keyboardState,
    [Out, MarshalAs(UnmanagedType.LPWStr, SizeConst = 64)] StringBuilder receivingBuffer,
    int bufferSize, uint flags);

private StringBuilder _pressCharBuffer = new StringBuilder(256);
private byte[] _pressCharKeyboardState = new byte[256];

public bool PreFilterMessage(ref Message m)
{
    var handled = false;
    if (m.Msg == 0x0100 || m.Msg == 0x0102) // WM_KEYDOWN or WM_CHAR
    {
        bool isShiftPressed = (ModifierKeys & Keys.Shift) != 0;
        bool isControlPressed = (ModifierKeys & Keys.Control) != 0;
        bool isAltPressed = (ModifierKeys & Keys.Alt) != 0;
        // AltGr reports as Ctrl+Alt; ModifierKeys cannot see Keys.RMenu directly.
        bool isAltGrPressed = isControlPressed && isAltPressed;

        for (int i = 0; i < 256; i++)
            _pressCharKeyboardState[i] = 0;
        if (isShiftPressed)
            _pressCharKeyboardState[(int)Keys.ShiftKey] = 0xff;
        if (isAltGrPressed)
        {
            _pressCharKeyboardState[(int)Keys.ControlKey] = 0xff;
            _pressCharKeyboardState[(int)Keys.Menu] = 0xff;
        }
        if (Control.IsKeyLocked(Keys.CapsLock))
            _pressCharKeyboardState[(int)Keys.CapsLock] = 0xff;

        Char chr = (Char)0;
        int ret = ToUnicode((uint)m.WParam.ToInt32(), 0, _pressCharKeyboardState, _pressCharBuffer, 256, 0);
        if (ret == 0)
            chr = Char.ConvertFromUtf32(m.WParam.ToInt32())[0];
        if (ret == -1)
        {
            // Dead key: call ToUnicode again to restore the keyboard buffer state.
            ToUnicode((uint)m.WParam.ToInt32(), 0, _pressCharKeyboardState, _pressCharBuffer, 256, 0);
        }
        else if (_pressCharBuffer.Length > 0)
            chr = _pressCharBuffer[0];
        if (m.Msg == 0x0102 && Char.IsWhiteSpace(chr)) // WM_CHAR
            chr = (Char)0;

        if (ret >= 0 && chr > 0)
        {
            // DO YOUR STUFF using either "chr" as a special key (UP, DOWN, etc.)
            // or _pressCharBuffer.ToString() (it can contain more than one character
            // if a dead key was pressed before), and don't forget to set "handled"
            // to true so nobody else can use the message afterwards.
        }
    }
    return handled;
}

It is known that ToUnicode() and its older counterpart ToAscii() can change the keyboard state of the current thread and thus mess with dead keys and ALT+NUMPAD keystrokes:
As ToUnicodeEx translates the virtual-key code, it also changes the
state of the kernel-mode keyboard buffer. This state-change affects
dead keys, ligatures, alt+numpad key entry, and so on. It might also
cause undesired side-effects if used in conjunction with
TranslateMessage (which also changes the state of the kernel-mode
keyboard buffer).
To avoid that, you can do your ToUnicode() call in a separate thread (it will have a separate keyboard state) or use a special flag in the wFlags param that is documented in the ToUnicode() docs:
If bit 2 is set, keyboard state is not changed (Windows 10, version
1607 and newer)
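For example, a minimal sketch of the flag in use (my illustration; assumes Windows 10 1607+, where "bit 2" corresponds to the value 0x4):
#include <windows.h>

// Translate a key to text without touching the kernel-mode keyboard buffer.
int TranslateKeyNonDestructively(UINT vkCode, UINT scanCode, WCHAR* buf, int bufLen)
{
    BYTE state[256];
    GetKeyboardState(state);
    // 0x4 = bit 2 set: "keyboard state is not changed" (Windows 10 1607+)
    return ToUnicode(vkCode, scanCode, state, buf, bufLen, 0x4);
}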
Or you can prepare a scan code -> character mapping table beforehand and update it on the language-change event, as sketched below.
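A sketch of building such a table (my illustration; it reuses the non-destructive 0x4 flag, so on systems older than Windows 10 1607 you would run it on a separate thread instead):
#include <windows.h>
#include <map>

// Build a scan code -> unshifted character table for the current layout.
std::map<UINT, WCHAR> BuildScanCodeTable()
{
    std::map<UINT, WCHAR> table;
    BYTE state[256] = {}; // empty state = no modifiers held
    for (UINT sc = 0x01; sc <= 0x5f; ++sc) {
        UINT vk = MapVirtualKeyW(sc, MAPVK_VSC_TO_VK_EX);
        WCHAR buf[4] = {};
        if (vk != 0 && ToUnicode(vk, sc, state, buf, 4, 0x4) == 1)
            table[sc] = buf[0];
    }
    return table; // rebuild this on WM_INPUTLANGCHANGE
}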
I think it should work with ToAscii() too, but better not to use that old, ANSI-codepage-dependent method. Use the ToUnicode() API instead; it can even return ligatures and UTF-16 surrogate pairs, if the keyboard layout has them. Some do.
See Asynchronous input vs synchronous input, a quick introduction
for the reason behind this.

I copy the vkCode into a queue and do the conversion from another thread:
@HOOKPROC
def keyHookKFunc(code, wParam, lParam):
    global gkeyQueue
    # lParam points at a KBDLLHOOKSTRUCT; cast it to read vkCode
    kbd = cast(lParam, POINTER(KBDLLHOOKSTRUCT)).contents
    gkeyQueue.append((code, wParam, kbd.vkCode))
    return windll.user32.CallNextHookEx(0, code, wParam, lParam)
This has the advantage of not delaying key processing by the OS.

This works for me:
byte[] keyState = new byte[256];
// Don't call GetKeyboardState(keyState) here;
// add only the keys you want:
keyState[(int)Keys.ShiftKey] = 0x80;   // SHIFT down
keyState[(int)Keys.Menu] = 0x80;       // ALT down
keyState[(int)Keys.ControlKey] = 0x80; // CONTROL down
// ToAscii should work fine now
if (ToAscii(myKeyboardStruct.VirtualKeyCode, myKeyboardStruct.ScanCode, keyState, inBuffer, myKeyboardStruct.Flags) == 1)
{
    //do something
}

Related

MapVirtualKey returns wrong chars in MAPVK_VK_TO_CHAR mode

I am trying to use the MapVirtualKey[A]/[W]/[ExA]/[ExW] API to map a VK_* code to a character by means of its MAPVK_VK_TO_CHAR (2) mode.
I have found that it always returns the 'A'..'Z' chars for VK_A..VK_Z, no matter which keyboard layout I have active.
The docs are saying that:
The uCode parameter is a virtual-key code and is translated into an
unshifted character value in the low order word of the return value.
Dead keys (diacritics) are indicated by setting the top bit of the
return value. If there is no translation, the function returns 0.
But I cannot get an unshifted character value, nor a non-ASCII character, from it.
For other buttons it works as described. And this behavior is all the more annoying considering that, for example, for the US English keyboard layout it returns:
VK_Q (0x51) -> `Q` (U+0051 Latin Capital Letter Q)
VK_OEM_PERIOD (0xbe) -> `.` (U+002E Full Stop)
But for Russian keyboard layout it returns:
VK_Q (0x51) -> `Q` (U+0051 Latin Capital Letter Q)
^- here it should return `й` (U+0439 Cyrillic Small Letter Short I) according to docs
VK_OEM_PERIOD (0xbe) -> `ю` (U+044E Cyrillic Small Letter Yu)
How to use it properly?
MapVirtualKey has known broken behaviour here.
The docs are lying to you about the MAPVK_VK_TO_CHAR (2) mode.
According to experiments and the leaked Windows XP source code (in the \windows\core\ntuser\kernel\xlate.c file), it special-cases the 'A'..'Z' VKs (these VKs are deliberately not defined in the Win32 API WinUser.h header and are equal to the 'A'..'Z' ASCII chars):
case 2:
    /*
     * Bogus Win3.1 functionality: despite SDK documenation, return uppercase for
     * VK_A through VK_Z
     */
    if ((wCode >= (WORD)'A') && (wCode <= (WORD)'Z')) {
        return wCode;
    }
Not sure why Microsoft decided to carry this bug over from Windows 3.1, but that is still the situation on my Windows 10.
Also, some keyboard layouts can emit multiple WCHAR characters on a single key press (UTF-16 surrogate pairs, or ligatures that can contain multiple Unicode code points). MapVirtualKey with MAPVK_VK_TO_CHAR fails to return proper values for these keys too - it returns the U+F002 code point in this case.
As a workaround, I can recommend using the ToUnicode[Ex] API, which can do this mapping for you:
// Returns UTF-8 string
std::string GetStrFromKeyPress(uint16_t scanCode, bool isShift)
{
    static BYTE keyboardState[256];
    memset(keyboardState, 0, 256);
    if (isShift)
    {
        keyboardState[VK_SHIFT] |= 0x80;
    }

    wchar_t chars[5] = { 0 };
    const UINT vkCode = ::MapVirtualKeyW(scanCode, MAPVK_VSC_TO_VK_EX);
    // This call can produce multiple UTF-16 code points
    // in case of ligatures or non-BMP Unicode chars that have a high and a low surrogate.
    // See examples: https://kbdlayout.info/features/ligatures
    int code = ::ToUnicode(vkCode, scanCode, keyboardState, chars, 4, 0);
    if (code < 0)
    {
        // Dead key
        if (chars[0] == 0 || std::iswcntrl(chars[0])) {
            return {};
        }
        code = -code;
    }

    // Clear the keyboard state that ToUnicode() may have changed
    {
        memset(keyboardState, 0, 256);
        const UINT clearVkCode = VK_DECIMAL;
        const UINT clearScanCode = ::MapVirtualKeyW(clearVkCode, MAPVK_VK_TO_VSC);
        wchar_t tmpChars[5] = { 0 };
        do {} while (::ToUnicode(clearVkCode, clearScanCode, keyboardState, tmpChars, 4, 0) < 0);
    }

    // Do not return control characters
    if (code <= 0 || (code == 1 && std::iswcntrl(chars[0]))) {
        return {};
    }
    return utf8::narrow(chars, code);
}
Or even better: if you have a Win32 message loop, just use TranslateMessage() (which calls ToUnicode() under the hood) and then process the WM_CHAR message, as sketched below.
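A minimal message-loop sketch of that approach (my illustration of the standard pattern, not code from this answer):
#include <windows.h>

void RunMessageLoop()
{
    MSG msg;
    while (GetMessageW(&msg, nullptr, 0, 0) > 0)
    {
        TranslateMessage(&msg);  // calls ToUnicode() internally, posts WM_CHAR
        DispatchMessageW(&msg);  // WM_CHAR arrives at the window procedure
                                 // with dead keys already composed
    }
}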
PS: The same applies to the GetKeyNameText API, since it calls MapVirtualKey(vk, MAPVK_VK_TO_CHAR) under the hood for keys that do not have an explicit name set in the keyboard-layout DLL (usually only non-character keys have names).

How to send hardware level keyboard input by program?

I have been using keybd_event for keyboard input from the command line. However, I want something that can send hardware-level input, just like a real keyboard.
Is there anything that suits my purpose?
I am using Windows as the operating system.
I once used SendInput to control a game character. The game (Icy Tower?) was using the DirectX input system (I think?) and somehow ignored keybd_event calls, but this method worked. I do not know how close to the hardware you need to be, but this did the trick for me. I used virtual-key codes originally, but turned them into scan codes for this answer.
UINT PressKeyScan(WORD scanCode)
{
    INPUT input[1] = {0};
    input[0].type = INPUT_KEYBOARD;
    input[0].ki.wVk = 0;
    input[0].ki.wScan = scanCode;
    input[0].ki.dwFlags = KEYEVENTF_SCANCODE;
    return SendInput(1, input, sizeof(INPUT));
}

UINT ReleaseKeyScan(WORD scanCode)
{
    INPUT input[1] = {0};
    input[0].type = INPUT_KEYBOARD;
    input[0].ki.wVk = 0;
    input[0].ki.wScan = scanCode;
    input[0].ki.dwFlags = KEYEVENTF_SCANCODE | KEYEVENTF_KEYUP;
    return SendInput(1, input, sizeof(INPUT));
}
To simulate a press and release, call them sequentially (or create a separate function that does both using the same INPUT structure):
WORD scanCodeSpace = 0x39;
PressKeyScan(scanCodeSpace);
ReleaseKeyScan(scanCodeSpace);
You can use MapVirtualKeyA to get the scan code for a virtual-key code, as in the sketch below.
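For example (my illustration; MAPVK_VK_TO_VSC is the documented virtual-key-to-scan-code mode):
// Press and release the 'A' key, deriving its scan code from the virtual key.
WORD scanCodeA = (WORD)MapVirtualKeyA('A', MAPVK_VK_TO_VSC);
PressKeyScan(scanCodeA);
ReleaseKeyScan(scanCodeA);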
keybd_event() is deprecated; use SendInput() instead.
SendInput() posts its simulated events to the same queue that the hardware driver posts its events to, as shown in the diagram from Raymond Chen's blog:
When something gets added to a queue, it takes time for it to come out the front of the queue.

No relevant answers on the actual behavior of kbhit() on characters such as ", %, ~ in Windows 10 when keyboard and locale are US (not international)

Windows 10 with the latest updates installed on a Dell XPS 13. US keyboard layout and US locale selected (not international). Still, a call to kbhit() or _kbhit() with specific characters such as ", ~, % does not return the key hit, at least not until a certain amount of time (~1 second) has passed and a second character has been hit.
I use kbhit() because I need a non-blocking function. How can I correctly detect a keyboard hit on " or % with a single keystroke?
In Linux, using a timed-out select() on stdin works great, but that doesn't seem to be an option on Windows.
Thanks,
-Patrick
I finally found a solution that fits my needs and fixes the issues I had with kbhit(); the code is below. I hope it helps others too.
– Patrick
int getkey();

//
// int getkey(): returns the typed character at the keyboard, or NO_CHAR if no
// keyboard key was pressed. This is done in non-blocking mode; i.e. NO_CHAR is
// returned if no keyboard event is read from the console event queue.
// This works a lot better for me than the standard kbhit(), which holds back
// some characters such as ", `, % and tries to deal with them before returning
// them, making it hard to follow up on what's really been typed in.
//
int getkey() {
    INPUT_RECORD buf;  // interested in the bKeyDown event
    DWORD len;         // number of events read
    int ch;

    ch = NO_CHAR;      // default return value
    PeekConsoleInput(GetStdHandle(STD_INPUT_HANDLE), &buf, 1, &len);
    if (len > 0) {
        if (buf.EventType == KEY_EVENT && buf.Event.KeyEvent.bKeyDown) {
            ch = _getche(); // set ch to the input char only under the right conditions;
        }                   // _getche() returns the char and echoes it to the console
        FlushConsoleInputBuffer(GetStdHandle(STD_INPUT_HANDLE)); // remove consumed events
    } else {
        Sleep(5);           // avoids too high a CPU usage when there is no input
    }
    return ch;
}
It is also possible to call ReadConsoleInput(GetStdHandle(STD_INPUT_HANDLE), &buf, 1, &len); rather than FlushConsoleInputBuffer(GetStdHandle(STD_INPUT_HANDLE)); in the code above, but for some unknown reason it doesn't seem to react as quickly, and some characters are missed when typing at the keyboard.
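For reference, a sketch of that variant (my illustration, using the same handle as the getkey() code above):
// Consume the one peeked event explicitly instead of flushing the whole buffer.
INPUT_RECORD consumed;
DWORD consumedLen = 0;
ReadConsoleInput(GetStdHandle(STD_INPUT_HANDLE), &consumed, 1, &consumedLen);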

Is there a way to forward TTN_NEEDTEXT to CFrameWnd?

I have an application that already handles generating tooltips. I am modifying a CWnd-derived class that has a parent frame but doesn't implement tooltips.
Starting from this, I can get tooltips to show up by adding the following code:
BEGIN_MESSAGE_MAP(CMyWindow, CWnd)
    ON_NOTIFY_EX_RANGE(TTN_NEEDTEXTW, 0, 0xFFFF, OnToolTipNotify)
    ON_NOTIFY_EX_RANGE(TTN_NEEDTEXTA, 0, 0xFFFF, OnToolTipNotify)
END_MESSAGE_MAP()

BOOL CMyWindow::OnToolTipNotify(UINT id, NMHDR* pNMHDR, LRESULT* pResult)
{
    UNREFERENCED_PARAMETER(id);
    UNREFERENCED_PARAMETER(pResult);

    // need to handle both ANSI and UNICODE versions of the message
    TOOLTIPTEXTA* pTTTA = (TOOLTIPTEXTA*)pNMHDR;
    TOOLTIPTEXTW* pTTTW = (TOOLTIPTEXTW*)pNMHDR;
    CStringA strTipText;
    UINT_PTR nID = pNMHDR->idFrom;

    if (pNMHDR->code == TTN_NEEDTEXTA && (pTTTA->uFlags & TTF_IDISHWND) ||
        pNMHDR->code == TTN_NEEDTEXTW && (pTTTW->uFlags & TTF_IDISHWND))
    {
        // idFrom is actually the HWND of the tool
        nID = ::GetDlgCtrlID((HWND)nID);
    }

    if (nID != 0) // will be zero on a separator
        strTipText.Format("Control ID = %d", nID);

    if (pNMHDR->code == TTN_NEEDTEXTA)
    {
        strncpy_s(pTTTA->szText, sizeof(pTTTA->szText), strTipText,
                  strTipText.GetLength() + 1);
    }
    else
    {
        ::MultiByteToWideChar(CP_ACP, 0, strTipText, strTipText.GetLength() + 1,
                              pTTTW->szText, sizeof(pTTTW->szText) / (sizeof pTTTW->szText[0]));
    }
    return TRUE; // message was handled
}
I can use GetParentFrame() to get the frame window, so I'd like to leverage the tooltip code that is already in place and get a consistent look. Is there some way I can forward the TTN_NEEDTEXT message so that it gets handled by the frame window?
In your message map you use ON_NOTIFY_whatever, which hints that the underlying message TTN_NEEDTEXT uses is WM_NOTIFY, and indeed this is the case. So you can just produce the WM_NOTIFY yourself and send it to the parent.
The documentation for WM_NOTIFY says wParam is the control's identifier and lParam is the NMHDR pointer. There's an example at the bottom of the page showing that wParam is just the idFrom member of the NMHDR, so you have everything you need to reconstruct the message:
LRESULT lr = this->GetParentFrame()->SendMessage(WM_NOTIFY, pNMHDR->idFrom, (LPARAM) pNMHDR);
When you should issue this call depends on what you need to do. In this case, you would probably want to make it the first call, then override your string and return TRUE. I'm not fully sure, though.
Note: if MFC provides a function to do this, I don't know it; I personally don't use MFC, but the concepts are the same.
So what about the return value from that SendMessage()? Well, you can return it via pResult (for instance, if you're just tweaking the parent's alterations slightly), or you can ignore it and return a custom value. But for TTN_NEEDTEXT it won't matter; TTN_NEEDTEXT doesn't care what the LRESULT is. A sketch combining these pieces follows.
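Putting that together, a minimal sketch of the forwarding handler (my illustration built on the one-liner above; the handler name matches the question's code):
BOOL CMyWindow::OnToolTipNotify(UINT id, NMHDR* pNMHDR, LRESULT* pResult)
{
    UNREFERENCED_PARAMETER(id);
    // Let the parent frame's existing tooltip code fill in the text.
    LRESULT lr = GetParentFrame()->SendMessage(WM_NOTIFY, pNMHDR->idFrom, (LPARAM)pNMHDR);
    if (pResult)
        *pResult = lr; // TTN_NEEDTEXT ignores the LRESULT anyway
    return TRUE;       // handled; the parent has already set szText
}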

Force GetKeyNameText to english

The Win32 function GetKeyNameText will provide the name of keyboard keys in the current input locale.
From MSDN:
The key name is translated according to the layout of the currently
installed keyboard, thus the function may give different results for
different input locales.
Is it possible to force the input locale for a short amount of time? Or is there another alternative to GetKeyNameText that will always return the name in English?
Update: this answer does not work. It actually modifies the user's keyboard settings. This appears to be a behavior change between Windows versions.
CString csLangId;
csLangId.Format( L"%08X", MAKELANGID( LANG_INVARIANT, SUBLANG_NEUTRAL ) );
HKL hLocale = LoadKeyboardLayout( (LPCTSTR)csLangId, KLF_ACTIVATE );
HKL hPrevious = ActivateKeyboardLayout( hLocale, KLF_SETFORPROCESS );
// Call GetKeyNameText
ActivateKeyboardLayout( hPrevious, KLF_SETFORPROCESS );
UnloadKeyboardLayout( hLocale );
WARNING: GetKeyNameText is broken (it returns wrong A-Z key names for non-English keyboard layouts, since it uses MapVirtualKey with the broken MAPVK_VK_TO_CHAR mode), and the pKeyNames and pKeyNamesExt text in keyboard-layout DLLs is bugged and outdated. I cannot recommend dealing with this stuff at all. :)
If you really want to get this info, you can load and parse it manually from the keyboard-layout DLL file (kbdus.dll, kbdger.dll, etc.).
There is a bunch of undocumented stuff involved:
To get the proper keyboard-layout DLL file name, you first need to convert the HKL to a KLID string. You can do this with code like:
// Returns KLID string of size KL_NAMELENGTH
// Same as GetKeyboardLayoutName but for any HKL
// https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/windows-language-pack-default-values
BOOL GetKLIDFromHKL(HKL hkl, _Out_writes_(KL_NAMELENGTH) LPWSTR pwszKLID)
{
    bool succeeded = false;
    if ((HIWORD(hkl) & 0xf000) == 0xf000) // deviceId contains layoutId
    {
        WORD layoutId = HIWORD(hkl) & 0x0fff;

        HKEY key;
        CHECK_EQ(::RegOpenKeyW(HKEY_LOCAL_MACHINE, L"SYSTEM\\CurrentControlSet\\Control\\Keyboard Layouts", &key), ERROR_SUCCESS);

        DWORD index = 0;
        while (::RegEnumKeyW(key, index, pwszKLID, KL_NAMELENGTH) == ERROR_SUCCESS)
        {
            WCHAR layoutIdBuffer[MAX_PATH] = {};
            DWORD layoutIdBufferSize = sizeof(layoutIdBuffer);
            if (::RegGetValueW(key, pwszKLID, L"Layout Id", RRF_RT_REG_SZ, nullptr, layoutIdBuffer, &layoutIdBufferSize) == ERROR_SUCCESS)
            {
                if (layoutId == std::stoul(layoutIdBuffer, nullptr, 16))
                {
                    succeeded = true;
                    DBGPRINT("Found KLID 0x%ls by layoutId=0x%04x", pwszKLID, layoutId);
                    break;
                }
            }
            ++index;
        }
        CHECK_EQ(::RegCloseKey(key), ERROR_SUCCESS);
    }
    else
    {
        WORD langId = LOWORD(hkl);
        // deviceId overrides langId if set
        if (HIWORD(hkl) != 0)
            langId = HIWORD(hkl);
        std::swprintf(pwszKLID, KL_NAMELENGTH, L"%08X", langId);
        succeeded = true;
        DBGPRINT("Found KLID 0x%ls by langId=0x%04x", pwszKLID, langId);
    }
    return succeeded;
}
Then, with the KLID string, you need to go to the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layouts\%KLID% registry path and read the Layout File string from it.
Load this DLL file from SHGetKnownFolderPath(FOLDERID_System, ...) (usually C:\Windows\System32) with a LoadLibrary() call.
Next, call GetProcAddress(KbdDllHandle, "KbdLayerDescriptor") - you receive a pointer that can be cast to PKBDTABLES.
There is a kbd.h header in the Windows SDK that has the KBDTABLES struct definition (there is some extra work involved to use the proper KBD_LONG_POINTER size for 32-bit code running on x64 Windows; see my link to the GTK source at the end).
You have to look at pKeyNames and pKeyNamesExt in it to get the scan code -> key name mapping. A condensed sketch of these steps follows.
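A condensed sketch of those steps (my illustration; it assumes kbd.h from the Windows SDK, skips error handling, and ignores the KBD_LONG_POINTER 32/64-bit wrinkle mentioned above):
#include <windows.h>
#include <kbd.h> // KBDTABLES / PKBDTABLES, from the Windows SDK

PKBDTABLES LoadKbdTables(const wchar_t* klid /* e.g. L"00000409" */)
{
    // Read the layout DLL name for this KLID from the registry.
    wchar_t subkey[MAX_PATH];
    swprintf(subkey, MAX_PATH,
             L"SYSTEM\\CurrentControlSet\\Control\\Keyboard Layouts\\%s", klid);
    wchar_t layoutFile[MAX_PATH] = {};
    DWORD size = sizeof(layoutFile);
    if (RegGetValueW(HKEY_LOCAL_MACHINE, subkey, L"Layout File",
                     RRF_RT_REG_SZ, nullptr, layoutFile, &size) != ERROR_SUCCESS)
        return nullptr;

    // The DLL lives in System32, which LoadLibraryW searches by default.
    HMODULE dll = LoadLibraryW(layoutFile);
    if (dll == nullptr)
        return nullptr;

    // The exported entry point returns a pointer to the layout tables.
    typedef PKBDTABLES (WINAPI *KbdLayerDescriptorFn)(void);
    KbdLayerDescriptorFn fn =
        (KbdLayerDescriptorFn)GetProcAddress(dll, "KbdLayerDescriptor");
    return fn ? fn() : nullptr; // inspect pKeyNames / pKeyNamesExt on the result
}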
Long story short: the GTK toolkit has code that does all of this (see here and here). They actually build scan code -> printed-character tables from the Windows keyboard-layout DLLs.
