Do PDCurses and ncurses have the same syntax?

I'm building a game with ncurses in Linux.
Can I "copy/paste" the code into Microsoft Visual Studio (properly set for PDCurses) and everything will run OK?
Thanks!

The syntax is the same, but the question isn't really about syntax.
They are "largely compatible", but each has features not found in the other. Offhand (no one's made a complete comparison):
PDCurses doesn't have a low-level (terminfo or termcap) interface
PDCurses has explicit definitions for alt/control keys, e.g.,
#define CTL_LEFT 0x1bb /* Control-Left-Arrow */
#define CTL_RIGHT 0x1bc
#define CTL_PGUP 0x1bd
#define CTL_PGDN 0x1be
#define CTL_HOME 0x1bf
#define CTL_END 0x1c0
With ncurses, those would be user-defined capabilities. The terminal description would have capabilities for the control cursor keys, such as kDN5 (control down-arrow), and the application finds these at runtime using tigetstr (to get the values) and key_defined (to find the keycode ncurses assigns to them); a short sketch follows this list. The names are based on xterm, but could include other terminals (most of the ones you'll find, aside from rxvt, copy xterm). It sounds cumbersome, but both ncurses and PDCurses took their own paths in extending X/Open Curses.
resize_term is different (in ncurses it responds to window size changes, while in PDCurses it is used to change the window size).
programs written to use Unicode values (or which assume that strings are UTF-8) probably will not port without some effort.
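
To make the ncurses side of the control-key point concrete, here is a minimal sketch (my own illustration, not from the answer) of looking up the control-down-arrow key at runtime. It assumes an xterm-style terminal whose description defines the extended capability kDN5:

#include <curses.h>
#include <term.h>

int main(void)
{
    initscr();
    keypad(stdscr, TRUE);               /* let ncurses translate function keys */

    /* tigetstr returns the escape sequence, or 0 / (char *)-1 when absent */
    char *seq = tigetstr("kDN5");
    if (seq != 0 && seq != (char *)-1) {
        int code = key_defined(seq);    /* keycode getch() will return for it */
        printw("control-down-arrow is keycode %d\n", code);
    } else {
        printw("this terminal does not define kDN5\n");
    }
    refresh();
    getch();
    endwin();
    return 0;
}

Note that key_defined is an ncurses extension, which is exactly the kind of thing that will not port to PDCurses unchanged.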

Related

Why does the Tool Help Library offer 2 versions of the same functions/structures?

I've noticed that the Tool Help Library offers some functions and structures in 2 versions: normal and ending with W, for example Process32First and Process32FirstW. Since their documentation is identical, I wonder what the differences between the two are.
The W and A versions stand for "wide" and "ANSI". In the past, Microsoft provided separate functions, structures and types for ANSI and Unicode strings. For the purpose of this answer, Unicode means wide char, which is 2 bytes per character, and ANSI is 1 byte per character (but it's actually more complicated than that). By supplying both types, the developer can use whichever they want, but the standard today is to use Unicode.
If you look at the ToolHelp32 header file, it does include both A and W versions of the structures and functions. If you're not finding them, you're not looking hard enough: do an explicit search for the identifiers and you will find them; if you only use "view definition" you will just land on the #ifdef macros. If you still can't find them, change the character set in your Visual Studio project and check again.
Because wide char arrays are twice the size, the structure layout will be wrong if you do not use the correct types. Let the macros resolve them for you: set the correct character set and use PROCESSENTRY32 without the A or W suffix; this is the preferred method. For some APIs, to be honest, you are better off using the ANSI version, but that is something you will learn with experience and have to decide for yourself.
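For reference, a minimal sketch (standard Toolhelp usage, not taken from the question) of enumerating processes with the suffix-less names, letting the project's character set pick the A or W variants:

#include <windows.h>
#include <tlhelp32.h>
#include <tchar.h>
#include <stdio.h>

int main(void)
{
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (snap == INVALID_HANDLE_VALUE)
        return 1;

    PROCESSENTRY32 pe;          /* resolves to PROCESSENTRY32W in a Unicode build */
    pe.dwSize = sizeof(pe);     /* must be set before the first call */

    if (Process32First(snap, &pe)) {    /* likewise resolves to Process32FirstW */
        do {
            _tprintf(_T("%5lu  %s\n"), pe.th32ProcessID, pe.szExeFile);
        } while (Process32Next(snap, &pe));
    }
    CloseHandle(snap);
    return 0;
}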
Here is an excellent article on the topic of character sets / encoding

Setting label text from other header files in Visual C++ 2010 .rc file

Assume that you have a label in a legacy Visual C++ 2010 project, defined like so:
[foo.rc]
LTEXT "Foo",IDC_STATIC,42,42,42,42
In a resource (.rc) file.
Now, you want to generate the text based on constants you define in a header file, like so:
[foo.rc]
LTEXT FOO_TEXT,IDC_STATIC,42,42,42,42
Where FOO_TEXT was previously defined in some other way, for instance:
[bar.h]
#define FROBNICATE "F"
#define OO "o"
#define ICANTTHINKOFMETASYNTACTICVARIABLESBEGINNINGWITHO "o"
#define FOO_TEXT (FROBNICATE OO ICANTTHINKOFMETASYNTACTICVARIABLESBEGINNINGWITHO)
Only that doesn't work, because .rc files are not header files, and the RC compiler complains, telling you:
[Build output]
1>foo.rc(42): error RC2116: expecting number for ID
1>
1>
1>foo.rc(42): error RC2108: expected numerical dialog constant
What would you do?
To clarify, yes, the entire string in question is known at compile-time, but it also needs to be constructed from smaller strings (in this case, version information and release category (development, release, and another one)). Of course, I could also write C++ code that does that, but that seems very inelegant to me.
So, is there a nicer way?
I don't think you will be able to achieve what you want without C++ code. See the comment on this MSDN article:
Don't use parens in #define
The resource compiler is very limited in its understanding of directives. So, for example, this:
#define RESTYPE_FILE (256)
will silently get ignored, while this:
#define RESTYPE_FILE 256
will work. Obviously, trying to use expressions or anything complicated like that will silently fail, leaving you wondering why you can't load that resource.
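
Since the resource compiler won't evaluate the expression, the fallback is the few lines of C++ the asker wanted to avoid: keep FOO_TEXT in the header (adjacent string literals concatenate at compile time) and set the label when the dialog initializes. A minimal sketch; IDC_FOO_LABEL is a hypothetical control ID, since the label needs its own ID rather than IDC_STATIC for SetDlgItemText to address it:

#include <windows.h>
#include "bar.h"        /* FROBNICATE, OO, ... and FOO_TEXT, as in the question */
#include "resource.h"   /* IDC_FOO_LABEL (hypothetical) */

INT_PTR CALLBACK FooDlgProc(HWND hDlg, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch (msg) {
    case WM_INITDIALOG:
        /* FOO_TEXT expands to ("F" "o" "o"), i.e. "Foo", at compile time */
        SetDlgItemTextA(hDlg, IDC_FOO_LABEL, FOO_TEXT);
        return TRUE;
    case WM_COMMAND:
        if (LOWORD(wParam) == IDOK || LOWORD(wParam) == IDCANCEL) {
            EndDialog(hDlg, LOWORD(wParam));
            return TRUE;
        }
        break;
    }
    return FALSE;
}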

About the "Character set" option in Visual Studio

I have an inquiry about the "Character set" option in Visual Studio. The Character Set options are:
Not Set
Use Unicode Character Set
Use Multi-Byte Character Set
I want to know: what is the difference between these three options?
Also, if I choose one of them, will it affect support for languages other than English (like RTL languages)?
It is a compatibility setting, intended for legacy code that was written for old versions of Windows that were not Unicode-enabled: the Windows 9x family, of which Windows ME was the last and widely ignored member. With "Not Set" or "Use Multi-Byte Character Set" selected, all Windows API functions that take a string as an argument are redefined to a little compatibility helper function that translates char* strings to wchar_t* strings, the API's native string type.
Such code critically depends on the default system code page setting. The code page maps 8-bit characters to Unicode which selects the font glyph. Your program will only produce correct text when the machine that runs your code has the correct code page. Characters whose value >= 128 will get rendered wrong if the code page doesn't match.
Always select "Use Unicode Character Set" for modern code. Especially when you want to support languages with a right-to-left layout and you don't have an Arabic or Hebrew code page selected on your dev machine. Use std::wstring or wchar_t[] in your code. Getting actual RTL layout requires turning on the WS_EX_RTLREADING style flag in the CreateWindowEx() call.
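For the last point, a minimal sketch (my own illustration, not from the answer) of passing WS_EX_RTLREADING to CreateWindowExW; the class registration and message loop are the usual boilerplate:

#include <windows.h>

static LRESULT CALLBACK WndProc(HWND h, UINT m, WPARAM w, LPARAM l)
{
    if (m == WM_DESTROY) { PostQuitMessage(0); return 0; }
    return DefWindowProcW(h, m, w, l);
}

int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR cmd, int nShow)
{
    WNDCLASSW wc = {0};
    wc.lpfnWndProc   = WndProc;
    wc.hInstance     = hInst;
    wc.lpszClassName = L"RtlDemo";
    wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1);
    RegisterClassW(&wc);

    HWND hwnd = CreateWindowExW(
        WS_EX_RTLREADING,                        /* right-to-left reading order */
        L"RtlDemo",
        L"\u0645\u0631\u062D\u0628\u0627",       /* Arabic "marhaba" as a wide literal */
        WS_OVERLAPPEDWINDOW,
        CW_USEDEFAULT, CW_USEDEFAULT, 400, 200,
        NULL, NULL, hInst, NULL);
    ShowWindow(hwnd, nShow);

    MSG msg;
    while (GetMessageW(&msg, NULL, 0, 0) > 0) {
        TranslateMessage(&msg);
        DispatchMessageW(&msg);
    }
    return 0;
}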
Hans has already answered the question, but I found these settings to have curious names. (What exactly is not being set, and why do the other two options sound so similar?) Regarding that:
"Unicode" here is Microsoft-speak for UCS-2 encoding in particular. This is the recommended and non-codepage-dependent described by Hans. There is a corresponding C++ #define flag called _UNICODE.
"Multi-Byte Character Set" (aka MBCS) here the official Microsoft phrase for describing their former international text-encoding scheme. As Hans described, there are different MBCS codepages describing different languages. The encodings are "multi-byte" in that some or all characters may be represented by multiple bytes. (Some codepages use a variable-length encoding akin to UTF-8.) Your typical codepage will still represent all the ASCII characters as one-byte each. There is a corresponding C++ #define flag called _MBCS
"Not set" apparently refers to compiling with_UNICODE nor _MBCS being #defined. In this case Windows works with a strict one-byte per character encoding. (Once again there are several different codepages available in this case.)
Difference between MBCS and UTF-8 on Windows goes into these issues in a lot more detail.
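To see how those #defines play out, here is a minimal sketch (my own, not from either answer) that behaves differently depending on which character set the project selects:

#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int _tmain(void)
{
    /* "Use Unicode Character Set"    -> _UNICODE defined -> TCHAR is wchar_t,
       MessageBox expands to MessageBoxW and _tprintf to wprintf.
       "Use Multi-Byte Character Set" -> _MBCS defined    -> TCHAR is char,
       MessageBox expands to MessageBoxA and _tprintf to printf. */
    _tprintf(_T("sizeof(TCHAR) = %u\n"), (unsigned)sizeof(TCHAR));   /* 2 or 1 */
    MessageBox(NULL, _T("Hello"), _T("Character set demo"), MB_OK);
    return 0;
}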

How does Windows wchar_t handle Unicode characters outside the basic multilingual plane?

I've looked at a number of other posts here and elsewhere (see below), but I still don't have a clear answer to this question: how does Windows wchar_t handle Unicode characters outside the Basic Multilingual Plane?
That is:
Many programmers seem to feel that UTF-16 is harmful because it is a variable-length code.
wchar_t is 16 bits wide on Windows, but 32 bits wide on Unix/macOS.
The Windows APIs use wide characters, not Unicode.
So what does Windows do when you want to encode something like 𠂊 (U+2008A, a Han character)?
The implementation of wchar_t under the Windows stdlib is UTF-16-oblivious: it knows only about 16-bit code units.
So you can put a UTF-16 surrogate sequence in a string, and you can choose to treat that as a single character using higher level processing. The string implementation won't do anything to help you, nor to hinder you; it will let you include any sequence of code units in your string, even ones that would be invalid when interpreted as UTF-16.
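A minimal sketch of what that means in practice (my own example, not from the answer): on Windows a character outside the BMP, such as U+2008A, occupies two wchar_t code units, and the string functions count code units, not characters:

#include <stdio.h>
#include <wchar.h>

int main(void)
{
    /* U+2008A encoded by hand as a surrogate pair: 0xD840 0xDC8A */
    const wchar_t han[] = { 0xD840, 0xDC8A, 0 };
    printf("code units: %u\n", (unsigned)wcslen(han));      /* prints 2, not 1 */

    /* the same character as a literal; the compiler produces the same pair
       when wchar_t is 16 bits wide, as it is on Windows */
    const wchar_t *lit = L"\U0002008A";
    printf("literal code units: %u\n", (unsigned)wcslen(lit));
    return 0;
}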
Many of the higher-level features of Windows do support characters made out of UTF-16 surrogates, which is why you can call a file 𐐀.txt and see it both render correctly and edit correctly (taking a single keypress, not two, to move past the character) in programs like Explorer that support complex text layout (typically using Windows's Uniscribe library).
But there are still places where you can see the UTF-16-obliviousness shining through, such as the fact you can create a file called 𐐀.txt in the same folder as 𐐨.txt, where case-insensitivity would otherwise disallow it, or the fact that you can create [U+DC01][U+D801].txt programmatically.
This is how pedants can have a nice long and basically meaningless argument about whether Windows “supports” UTF-16 strings or only UCS-2.
Windows used to use UCS-2 but adopted UTF-16 with Windows 2000. Windows wchar_t APIs now produce and consume UTF-16.
Not all third party programs handle this correctly and so may be buggy with data outside the BMP.
Also, note that UTF-16, being a variable-length encoding, does not conform to the C or C++ requirements for an encoding used with wchar_t. This causes some problems: standard functions that take a single wchar_t, such as wctomb, can't handle characters beyond the BMP on Windows, and Windows defines some additional functions that use a wider type in order to handle single characters outside the BMP. I forget which function it was, but I ran into a Windows function that returned int instead of wchar_t (and it wasn't one where EOF was a possible result).

Pantheios wide characters?

I'm trying to integrate logging into my Windows C++ application, and I wanted to use Pantheios, as it generally gets very favorable comments. That said, all the included examples use macros like PANTHEIOS_LITERAL_STRING for wrapping string literals, and require typedefs like:
typedef std::basic_string<PAN_CHAR_T> string_t;
to compile correctly. I think this is ugly, and would prefer to not use these typedefs.
Here's an example: http://www.pantheios.org/doc/html/cpp_2misc_2example_8cpp_8misc_8strings_2example_8cpp_8misc_8strings_8cpp-example.html
I tried compiling Pantheios with PANTHEIOS_USE_WIDE_STRINGS disabled but get lots of build errors -- any ideas?
As you've observed, the file backend assumes multibyte output in a multibyte build and wide output in a wide build by default, but IIRC there are initialisation options (for be.file) that allow you to force it one way or the other, regardless of how you're building.
FWIW, I would think the examples have to take into account all permutations, and that's why the "ugliness" you report is there. If you're only building for one character encoding or the other, you don't have to do that. It's much like examples of Windows coding that use TCHAR and all the _tcsXXX() functions: you don't have to do that unless you want your code to work with both.
HTH
