Display Arabic text in Rich TextBox

Display Arabic text in Rich TextBox - vb6

Hi i am using VB6 and I have a field in my DB which stores Arabic text like "ÓÔÄÇåì". Now I want to display this in RichTextbox. I have set RichTextBox1.Font.CharSet property to 178 and RichTextBox1.Font.Name to Arial. But still it's not displaying in the correct format which is "سشؤاهى". Please help.

You can use the Text Object Model (tom) type library to get Unicode access to the RichEdit control wrapped within a RichTextBox.
You can use API calls to get Unicode access to the same thing.
You can replace usage of the RichTextBox with another control, for example the InkEdit control that ships as part of Windows beginning in Vista. Turn off the ink capture capability and you have a Unicode RTF box via the normal properties (.Text, .TextRTF, etc.).
All of this presumes that your database actually contains Unicode text.

Related

Support "styled text" in a scriptable Mac application (Cocoa Scripting)

My app supports being scripted with Applescript.
I am trying to make styled text content, stored in NSAttributedString objects, available to an Applescript user.
I thought I could simply deliver styled text with the NSAttributedString class, just like I deliver plain text with the NSString class, but that does not work - Cocoa Scripting then reports that it cannot convert or coerce the data.
I wonder if I'm missing something or if this is just plain impossible with the standard classes supported by Cocoa Scripting?
AppleScript does know the "styled text" type, as seen in this example:
set stxt to "foo" as styled text
So, if AppleScript knows this type by default, shouldn't the Cocoa Scripting engine support it as well somehow?

As always there are many choices for solving an AS problem.
In my scriptable text editor (Ted), I implemented the Text Suite, which is based on rich text (NSTextStorage, a subclass of NSMutableAttributedString). I wanted to be able to script tabs in my paragraphs, so I added a style record, which contains all the paragraph style information. This lets me write scripts like this:
tell application "Ted"
set doc1 to make new document at beginning with properties {name:"Document One"}
tell doc1
set p1 to make new paragraph at end with data "Paragraph One" with properties {size:24, color:maraschino}
set p2 to make new paragraph at end with data "Paragraph Two" with properties {style:style of paragraph 1}
set color of paragraph 1 to blue
end tell
set doc2 to make new document at beginning with properties {name:"Document Two"}
copy p1 to beginning of doc2
properties of paragraph 1 of doc2
end tell
Since p1 is rich text, the second document ends up with both the text and formatting of the first paragraph of the first document.
You can also ask for the properties of a piece of text, where I have implemented the usual Text Suite properties, as well as a "style" property for paragraph style (backed by NSParagraphStyle, since I wanted to be able to script the tab stops):
properties of paragraph 1 of doc2
Result:
{height:60.0, italic:false, size:24, style:{paragraph spacing after:0.0, head indent:0.0, line break mode:0, alignment:4, line spacing:0.0, minimum line height:0.0, first line head indent:0.0, paragraph spacing before:0.0, tabs:{"28L", "56L", "84L", "112L", "140L", "168L", "196L", "224L", "252L", "280L", "308L", "336L"}, tail indent:0.0, maximum line height:0.0, line height multiple:0.0, default tab interval:0.0}, color:blue, width:164.109375, font:"Helvetica", bold:false, class:attribute run}
This works well for passing rich text within my application, but may not be as useful for passing styled text to other applications, which may be what you wanted to do. I think adding a "style" property (of type record) is probably the best way to convey style info for use in other scriptable apps. Then in the second app, the scripter can make use of any properties in the style record that the second app understands.

It looks like there is no implicit support for styled text in AppleScript. And there is also no common interchange record type for passing styled text.
AppleScript was developed in the pre-OSX days when styled text was often represented by a combination of a plain text (in System or MacRoman encoding) and a styl resource. With Unicode came an alternative format of a ustl style format. These are still used with the Carbon Pasteboard API (PasteboardCreate etc.) today. Yet, none of these seem to have made it into the use with AppleScript.
The fact that AppleScript knows of a styled text type has no special meaning. Even its class is just text.
Update
I just found that Matt Neuburg's book "AppleScript The Definitive Guide" mentions styled text and gives an example where it's indeed showing a record containing both the plain text (class ktxt) and style data (class ksty) with data of type styl, just as I had expected above. He also points out that most applications don't use that format, though.
So, it appears using a record with style resource data is indeed the intended way, only that hardly anyone knows about it. Go figure.

Has anyone heard of this strange bug with the standard Windows message box?

Years ago, I was messing around with Visual Basic and I discovered a bug with the MsgBox function. I tried searching for it, but nobody had ever said anything about it. It's not just with Visual Basic though; it's with anything that uses the standard Windows MessageBox API call.
The bug is triggered when the title text has more than one character, and the first character is a lowercase 'y' with an umlaut ('ÿ'). What's so special about this character? It almost definitely not the character itself, but rather its ASCII value that's special. 'ÿ' is character 255 (0xFF), meaning it's the highest value that can be stored in an unsigned byte, and all its bits are set to 1.
What does this bug do? Well, there are two different possibilities, which depend on the number of characters in the title text. If there are an even number of characters (unless it's 2) in the title text, no message box appears, and you just hear the alert sound. If there are two characters in the title text, or any odd number other than 1 (in which case the bug wouldn't be triggered)...then this happens:
And that's not all--the message will also be truncated to one line. It seems like the kind of bug that would occur in at least one semi-high-profile incident, considering how often this API call is used. Are there any reports of this on the Internet, or anything showing what could cause it? Maybe it's a Unicode-related glitch, like that "bush hid the facts" glitch in Notepad?
I made a program in case you want to play around with this; download it here.
Alternatively, copy the following into Notepad, save it with a .vbs extension, and double-click it to display the dialog box seen above:
MsgBox "Windows 3.1 font, anyone?", 0, "ÿ ODD NUMBER!"
Or for a different font:
MsgBox "I CAN HAS CHEEZBURGER?", 0, "ÿ HImpact"
EDIT: It seems that if the first four characters are ÿ's, it doesn't ever display the message, even if there's an odd number of characters.

This is a bug with dialog templates generally. It is not a message box bug as such.
For example, in Visual Studio create the default win32 application. In the .rc file, change the caption in the template for the about box from
CAPTION "About sampleapp"
to
CAPTION "ÿT"
and the bug will manifest itself when you display the about box.
In the DLGTEMPLATEEX documentation note that the menu and class name have type sz_Or_Ord which means either a null-terminated string or 0xFFFF followed by a single word resource identifier.
Windows incorrectly applies a similar scheme to the dialog title: if the first character is 0xFF then it treats the title as being two WORDs long, but only when it is trying to locate the font information. When it is displaying the title it correctly treats the title as a string.
In other words, Windows is looking for the font information inside the title string. In most case this won't specify a valid font, so Windows defaults to the system font.
To prove this, I constructed a dialog template in memory (based on this). Once this was working I deleted the code that writes the font information to the template and used the dialog title "ÿa\xd\x200\x21SimSun". This displays the dialog in italic SimSun because windows is reading the font information from the title string.
This bug is likely a hangover from 16-bit Windows, where (I guess) 0xFF was used as the resource ID marker.

A strange bug. I suspect the symptoms are the result of the way the MessageBox() actually displays the dialog.
Internally, MessageBox() builds a dialog template dynamically. If you look at the description of a DLGTEMPLATE structure you'll find the following nugget of information:
In a standard template for a dialog box, the DLGTEMPLATE structure is
always immediately followed by three variable-length arrays that
specify the menu, class, and title for the dialog box. When the
DS_SETFONT style is specified, these arrays are also followed by a
16-bit value specifying point size and another variable-length array
specifying a typeface name.
So, the in-memory layout of a dialog template has the font specification immediately following the dialog box title.
Visual Basic does not use Unicode and so the function you're calling is actually MessageBoxA(). This is simply a thunk that converts the passed-in strings from multibyte to Unicode and then calls MessageBoxW().
I believe what's happening is that, for some reason, the conversion of that string from multibyte to Unicode is either going wrong, or returning a spurious length value. This has the knock-on effect, when the dialog template is built in memory, of corrupting the memory immediately following the title string - which, as we know, is the font specification.

How to display unicode Arabic string in VS output window?

I have a uni-code string in Arabic to display in output window rather than in console, so I could only use OutputDebugStringW, and I call SetConsoleOutputCP(1256) to set Arabic code page but still it only output "????". What should I do...

This is a documented restriction for OutputDebugStringW():
OutputDebugStringW converts the specified string based on the current system locale information and passes it to OutputDebugStringA to be displayed. As a result, some Unicode characters may not be displayed correctly.
Calling SetConsoleOutputCP() doesn't solve the problem, that changes the code page for the console window, not the debugger. You'd have to change your system locale, Control Panel + Region, Administrative tab. If Arabic is your favorite language then changing it to 1256 is the appropriate thing to do. It will of course have system-wide effects.

How to display unicode control characters in visual studio text visualizer?

I get some text string from service, which contains Unicode control characters
(i.e \u202B or \u202A and others for Arabic language support).
But while debugging I can't see them in default text visualizer. So I need to enable display for such characters to determine which of them my text consists of. There is checkbox in text visualizer "show all characters", but it doesn't work as I expect.
Any suggestions?
Thanks in advance

Those are codes for explicit RLE and LRE order, ie if in RLE something should be displayed in LRE order.
http://unicode.org/reports/tr9/#Directional_Formatting_Codes

What does the charset parameter of CreateFont exactly set?

The code page on my Windows is set to ANSI (Latin1, Windows-1252).
I create a font with CreateFont and pass RUSSIAN_CHARSET in fdwCharSet
This is what I experience:
Windows controls (like a Static for example) using this font ignore the font's character set: the string passed to SetWindowTextA is displayed with Latin characters
After selecting this font on the DC, GDI text functions (Ext)TextOutA and DrawTextA use the character set of the font. Strings passed to them are displayed with cyrillic letters.
Why? When is the charset parameter of the font taken into account and when is it ignored? Can I force windows controls to use the font's character set?

You will have to convert the text to Unicode and call SetWindowTextW() instead of SetWindowTextA().
Make sure the window's class is registered with RegisterClassW() and not RegisterClassA(). This is what really determines if a window is Unicode. You can use IsWindowUnicode() to verify that the window is really Unicode.
Make sure you pass unhandled messages to DefWindowProcW() and not DefWindowProcA().
Or, if the window is a dialog, just make sure it is created with CreateDialogW() or DialogBoxParamW().

>Can I force windows controls to use the font's character set?
AFAIK no, you can't.
SetWindowTextA just converts the argument to Unicode, then calls SetWindowTextW: both windows kernel, shell, and GDI are unicode.
To convert the argument to Unicode, SetWindowTextA uses the setting from Window's regional options ("Language for non-Unicode programs").

Here is what is happening:
You call SetWindowText with something like "\xC4\xEE\xE1\xF0\xEE\xE5 xF3\xF2\xF0\xEE".
You're compiled as an ANSI application rather than Unicode, so that maps to a call to SetWindowTextA.
SetWindowTextA sees that the window was created in ANSI mode, so it sets the string directly. (If it had been a Unicode window, then it would converts the ANSI input string to Unicode and passes it onto SetWindowTextW.)
The ANSI window converts the ANSI string to Unicode so that it can display it. But it doesn't know the string is in a different code page than the default one for the system. It's this conversion that's changing everything back to Latin characters. It assumes that the input string is in the process's default code page (Windows 1252 in your case). So now you have a bunch of accented Latin characters instead of a string of Cyrillic ones.
The control tries to display this Unicode string using something like DrawTextW or TextOutW.
The lower level part of the system says, "Oh crap, this string is a bunch of Latin characters, but the user has chosen a Cyrillic font." To solve the problem, it uses font linking (or font fallback, I get those terms confused) to effectively select a font compatible with 1252.
You see Latin gobbledygook instead of proper Russian.
I tried coming up with a minimal way to do what you need, but I failed. My first idea was to do the conversion yourself and call SetWindowTextW directly:
void SetWindowTextRussian(HWND hwnd, char *pszCyrillic) {
const int cchCyrillic = ::lstrlen(pszCyrillic);
const int cchUnicode = 4 * cchCyrillic; // worst case
WCHAR *pszUnicode = new WCHAR[cchUnicode];
// See: http://msdn.microsoft.com/en-us/library/dd317756(v=vs.85).aspx
const UINT CP_CYRILLIC = 1251;
if (::MultiByteToWideChar(CP_CYRILLIC, 0, pszCyrillic, -1,
pszUnicode, cchUnicode) > 0) {
::SetWindowTextW(hwnd, pszUnicode);
}
delete [] pszUnicode;
}
But this doesn't work. I suspect that since the window was created as an ANSI window, the Unicode string is converted back to ANSI (assuming the wrong code page again), and then you get question marks instead of Latin nonsense.
I think you're going to have to convert to a Unicode app, or only run with the default code page set to 1251.
Update: If you control the creation of the window (e.g., you call CreateWindow directly rather than having a dialog box instantiate the controls), then you can probably make the above work by calling CreateWindowW directly and creating a Unicode window for the controls that matter.

Consider hooking gdi32full.dll GetCodePage to select the code page that you need. CP_UTF8 for example. It has single pointer param, returns single DWORD (the code page) and stdcall calling convention.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio