How does wsprintf know the length of the lpFmt string? - winapi

wsprintf uses the _cdecl calling convention just like printf. The latter pops the address of a null-terminated format string off the stack. But the WinAPI definition of wsprintf uses the LPCTSTR type, i.e. no null at the end.
I am wondering how the length of the LPCTSTR lpFmt is computed. I mean, the function has to stop reading the format buffer at some point. And it does. And it works.

LPCTSTR is null-terminated. It is
const char*
or
const wchar_t*
depending on whether or not you target Unicode. But either way, it is null-terminated.
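For reference, the mapping in the SDK headers is roughly the following (a simplified sketch of the winnt.h definitions, not the verbatim header):
#ifdef UNICODE
typedef const wchar_t *LPCTSTR;   // really LPCWSTR
#else
typedef const char    *LPCTSTR;   // really LPCSTR
#endif
// In both cases the pointed-to string is expected to end with a null terminator.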

LPCTSTR for UNICODE builds is defined as LPCWSTR, and in the MSDN docs you can read that LPCWSTR is defined as:
A pointer to a constant null-terminated string of 16-bit Unicode
characters. For more information, see Character Sets Used
so wsprintf reads the format string until it finds the L'\0' character. This is actually not stated explicitly in the wsprintf documentation.
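In effect the function walks the format string character by character, along the lines of this rough sketch (not the actual implementation):
const wchar_t *p = lpFmt;
while (*p != L'\0')        // stop at the terminating null
{
    // interpret '%' specifiers, copy ordinary characters to the output...
    ++p;
}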

You are confusing several things.
_cdecl differs from __stdcall in how arguments are pushed onto the stack; it has nothing to do with null-terminated strings.
Then LPCTSTR, LPSTR, char*, wchar_t* are all null-terminated strings (the difference is that some of them are Unicode while others are ANSI).
Another string type that Microsoft uses (and which is not mentioned here) is BSTR. A BSTR is not a plain null-terminated string; its length is stored in a prefix just before the character buffer. (A BSTR is a string of 16-bit characters, i.e. Unicode.)
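To illustrate the length prefix, a small sketch (assuming <windows.h> and OleAut32 are available):
#include <windows.h>   // SysAllocStringLen, SysStringLen, SysFreeString

BSTR b = SysAllocStringLen(L"ab\0cd", 5);  // a BSTR may even contain embedded nulls
UINT cch = SysStringLen(b);                // 5 - taken from the length prefix, not by scanning for L'\0'
SysFreeString(b);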

Related

What is the data type STR in winapi?

I'm reading a Windows programming book and this is written on page 18:
STR data types are string data types, with storage already allocated. This data type is
less common than the LPSTR. STR data types are used when the string is supposed to be
treated as an immediate array, and not as a simple character pointer.
LPSTR stands for "Long Pointer to a STR", and is essentially defined as such: #define STR * LPSTR;
(I think the second quotation has a typo; the author probably meant typedef STR * LPSTR; rather than #define STR * LPSTR;.)
There is no explanation of what STR is or how it is defined. This MSDN page also doesn't have a definition of it. And the definition of LPSTR there differs from the book's: typedef CHAR *LPSTR;
I think the book says STR is an array of characters, and I think Microsoft says STR is the same as CHAR, which is defined by typedef char CHAR;. How should I think about it?
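For comparison, here are the actual SDK definitions (paraphrased from winnt.h) and a sketch of what the book seems to mean by an STR "with storage already allocated" (my interpretation, not an official type):
typedef char CHAR;            // winnt.h
typedef CHAR *LPSTR;          // winnt.h: pointer to a char string
typedef const CHAR *LPCSTR;   // winnt.h: pointer to a constant char string

CHAR name[32] = "example";    // what the book seems to call an STR: the array owns its storage
LPSTR p = name;               // an LPSTR merely points at storage owned elsewhere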

Is ZeroMemory the windows equivalent of null terminating a buffer?

For example, by convention I null-terminate a buffer (set the buffer's contents to zero) the following way, example 1:
char buffer[1024] = {0};
And with the windows.h library we can call ZeroMemory, example 2:
char buffer[1024];
ZeroMemory(buffer, sizeof(buffer));
According to the documentation provided by Microsoft, ZeroMemory "fills a block of memory with zeros". I want to be accurate in my Windows application, so I thought what better place to ask than Stack Overflow.
Are these two examples equivalent in logic?
Yes, the two examples are equivalent. The entire array is filled with zeros in both cases.
In the case of char buffer[1024] = {0};, you are explicitly setting only the first char element to 0, and then the compiler implicitly value-initializes the remaining 1023 char elements to 0 for you.
In C++11 and later, you can omit that first element value:
char buffer[1024] = {};
char buffer[1024]{};
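If you want to convince yourself, a minimal check could look like the sketch below (ZeroMemory is effectively a memset(..., 0, ...) over the whole block):
#include <windows.h>
#include <cassert>
#include <cstring>

int main()
{
    char a[1024] = {0};            // first element explicit, the rest value-initialized to 0
    char b[1024];
    ZeroMemory(b, sizeof(b));      // fills the whole block with zeros

    assert(std::memcmp(a, b, sizeof(a)) == 0);  // both buffers hold identical, all-zero contents
    return 0;
}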

Convert QString to BSTR and vice versa

I want to convert QString to BSTR and vice versa.
This is what I try in order to convert QString to BSTR:
std::wstring str_ = QString("some texts").toStdWString();
BSTR bstr_ = str_.c_str();
and to convert BSTR to QString:
BSTR bstr_;
wchar_t *str_ = bstr_;
QString qstring_ = QString::fromWCharArray(str_);
Is this correct? In other words, is there any data loss? If so, what is the correct solution?
You should probably use SysAllocString to do this - a BSTR also contains a length prefix, which is not created by your code.
std::wstring str_ = QString("some texts").toStdWString();
BSTR bstr_ = SysAllocString(str_.c_str());
Other than that there isn't anything to be lost here - Both BSTR and QString use 16-bit Unicode encoding, so converting between each other should not modify internal data buffers at all.
To convert a BSTR to a QString you can simply use the QString::fromUtf16 function:
BSTR bstrTest = SysAllocString(L"ConvertMe");
QString qstringTest = QString::fromUtf16(reinterpret_cast<const ushort*>(bstrTest));
BSTR strings consist of two parts: four bytes for the string length, and the content itself, which can contain null characters.
The short way to do it would be:
Convert the QString to a two-byte null-terminated string using QString::utf16. Do not use toWCharArray; a wide char has a different size on Windows (two bytes) and Linux (four bytes). (I know COM is Microsoft tech, but better be safe.)
Use SysAllocString to create a BSTR, which already stores the string length.
Optionally, free the BSTR with SysFreeString when you are done using it. Please read the following article to know when you need to release it:
https://learn.microsoft.com/en-us/cpp/atl-mfc-shared/allocating-and-releasing-memory-for-a-bstr?view=vs-2017
BSTR bstr = ::SysAllocString(reinterpret_cast<const OLECHAR*>(QString("stuff").utf16()));
// use it
::SysFreeString(bstr);
To convert from BSTR to QString, you can reinterpret-cast BSTR to a ushort pointer, and then use QString::fromUtf16. Remember to free the BSTR when you are done with it.
QString qstr = QString::fromUtf16(reinterpret_cast<ushort*>(bstr));
The following article explains BSTR strings very well:
https://www.codeproject.com/Articles/13862/COM-in-plain-C-Part
BSTR oldStr;   // assumed to already hold a valid BSTR
QString newStr{QString::fromWCharArray(oldStr)};
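Putting the pieces together, a round-trip pair of helpers could look roughly like this (a sketch assuming Qt 5-style QString::fromUtf16(const ushort*, int) and <windows.h>; the caller owns the returned BSTR):
#include <windows.h>
#include <QString>

// QString -> BSTR; the caller must release the result with SysFreeString.
BSTR toBstr(const QString &s)
{
    return ::SysAllocStringLen(reinterpret_cast<const OLECHAR *>(s.utf16()),
                               static_cast<UINT>(s.length()));
}

// BSTR -> QString; uses the length prefix, so embedded nulls are preserved.
QString fromBstr(BSTR b)
{
    return QString::fromUtf16(reinterpret_cast<const ushort *>(b),
                              static_cast<int>(::SysStringLen(b)));
}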

How do you convert a 'System::String ^' to 'TCHAR'?

I asked a question here involving C++ and C# communicating. The problem got solved but led to a new problem.
This returns a String (C#):
return Marshal.PtrToStringAnsi(decryptsn(InpData));
This expects a TCHAR* (C++):
lpAlpha2[0] = Company::Pins::Bank::Decryption::Decrypt::Decryption("123456");
I've googled how to solve this problem, but I am not sure why the String has a caret (^) on it. Would it be best to change the return type from String to something else that C++ would accept, or would I need to do a conversion before assigning the value?
String has a ^ because that's the marker for a managed reference. Basically, it's used the same way as * in unmanaged land, except it can only point to an object type, not to other pointer types, or to void.
TCHAR is #defined (or perhaps typedefed, I can't remember) to either char or wchar_t, based on the _UNICODE preprocessor definition. Therefore, I would use that and write the code twice.
Either inline:
TCHAR* str;
String^ managedString;
#ifdef _UNICODE
str = (TCHAR*) Marshal::StringToHGlobalUni(managedString).ToPointer();
#else
str = (TCHAR*) Marshal::StringToHGlobalAnsi(managedString).ToPointer();
#endif
// use str.
Marshal::FreeHGlobal(IntPtr(str));
or as a pair of conversion methods, both of which assume that the output buffer has already been allocated and is large enough. Method overloading should make it pick the correct one, based on what TCHAR is defined as.
void ConvertManagedString(String^ managedString, char* outString)
{
    char* str;
    str = (char*) Marshal::StringToHGlobalAnsi(managedString).ToPointer();
    strcpy(outString, str);
    Marshal::FreeHGlobal(IntPtr(str));
}
void ConvertManagedString(String^ managedString, wchar_t* outString)
{
    wchar_t* str;
    str = (wchar_t*) Marshal::StringToHGlobalUni(managedString).ToPointer();
    wcscpy(outString, str);
    Marshal::FreeHGlobal(IntPtr(str));
}
The syntax String^ is C++/CLI talk for "(garbage collected) reference to a System.String".
You have a couple of options for the conversion of a String into a C string, which is another way to express the TCHAR*. My preferred way in C++ would be to store the converted string into a C++ string type, either std::wstring or std::string, depending on whether you are building the project as a Unicode or MBCS project.
In either case you can use something like this:
std::wstring tmp = msclr::interop::marshal_as<std::wstring>( /* Your .NET String */ );
or
std::string tmp = msclr::interop::marshal_as<std::string>(...);
Once you've converted the string into the correct wide or narrow string format, you can then access its C string representation using the c_str() function, like so:
callCFunction(tmp.c_str());
This assumes that callCFunction expects you to pass it a C-style char* or wchar_t* (which TCHAR* will "degrade" to, depending on your compilation settings).
That is a really rambling way to ask the question, but if you mean how to convert a String ^ to a char *, then you use the same marshaller you used before, only backwards:
char* unmanagedstring = (char *) Marshal::StringToHGlobalAnsi(managedstring).ToPointer();
Edit: don't forget to release the allocated memory with Marshal::FreeHGlobal when you're done.
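For completeness, the full allocate/use/free pattern would look roughly like this (a sketch; managedstring is assumed to be a valid String^):
using namespace System;
using namespace System::Runtime::InteropServices;

char* unmanagedstring = (char *) Marshal::StringToHGlobalAnsi(managedstring).ToPointer();
// ... use unmanagedstring ...
Marshal::FreeHGlobal(IntPtr(unmanagedstring));   // release the unmanaged copy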

IDebugSymbols::GetNameByOffset and overloaded functions

I'm using IDebugSymbols::GetNameByOffset and I'm finding that I get the same symbol name for different functions that overload the same name.
E.g. The code I'm looking up the symbols for might be as follows:
void SomeFunction(int) {..}
void SomeFunction(float) {..}
At runtime, when I have an address of an instruction from each of these functions I'd like to use GetNameByOffset and tell the two apart somehow. I've experimented with calling SetSymbolOptions toggling the SYMOPT_UNDNAME and SYMOPT_NO_CPP flags as documented here, but this didn't work.
Does anyone know how to tell these to symbols apart in the debugger engine universe?
Edit: Please see my comment on the accepted answer for a minor amendment to the proposed solution.
Quote from dbgeng.h:
// A symbol name may not be unique, particularly
// when overloaded functions exist which all
// have the same name. If GetOffsetByName
// finds multiple matches for the name it
// can return any one of them. In that
// case it will return S_FALSE to indicate
// that ambiguity was arbitrarily resolved.
// A caller can then use SearchSymbols to
// find all of the matches if it wishes to
// perform different disambiguation.
STDMETHOD(GetOffsetByName)(
    THIS_
    __in PCSTR Symbol,
    __out PULONG64 Offset
    ) PURE;
So, I would get the name with IDebugSymbols::GetNameByOffset() (it comes back like "module!name" I believe), make sure it is an overload (if you're not sure) using IDebugSymbols::GetOffsetByName() (which is supposed to return S_FALSE for multiple overloads), and look up all possibilities with this name using StartSymbolMatch()/EndSymbolMatch(). Not a one liner though (and not really helpful for that matter...)
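A rough sketch of that enumeration (assuming m_Symbols is a valid IDebugSymbols* and the name came back from GetNameByOffset as "module!SomeFunction"):
// Sketch only - error handling mostly omitted.
ULONG64 handle = 0;
if (SUCCEEDED(m_Symbols->StartSymbolMatch("module!SomeFunction", &handle)))
{
    char name[256];
    ULONG64 offset = 0;
    while (m_Symbols->GetNextSymbolMatch(handle, name, sizeof(name), NULL, &offset) == S_OK)
    {
        // each iteration yields one candidate symbol; 'offset' distinguishes the overloads
    }
    m_Symbols->EndSymbolMatch(handle);
}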
Another option would be to go with
HRESULT
IDebugSymbols3::GetFunctionEntryByOffset(
    IN ULONG64 Offset,
    IN ULONG Flags,
    OUT OPTIONAL PVOID Buffer,
    IN ULONG BufferSize,
    OUT OPTIONAL PULONG BufferNeeded
    );
// It can be used to retrieve FPO data on a particular function:
FPO_DATA fpo;
HRESULT hres = m_Symbols3->GetFunctionEntryByOffset(
    addr,         // Offset
    0,            // Flags
    &fpo,         // Buffer
    sizeof(fpo),  // BufferSize
    0             // BufferNeeded
    );
and then use fpo.cdwParams for basic parameter-size discrimination (cdwParams = the size of the parameters, in DWORDs).
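For example, assuming addr1 and addr2 are instruction offsets inside the two overloads, the parameter sizes can be compared like this (a sketch; it only helps when the overloads actually differ in total parameter size):
FPO_DATA fpo1 = {}, fpo2 = {};
m_Symbols3->GetFunctionEntryByOffset(addr1, 0, &fpo1, sizeof(fpo1), 0);
m_Symbols3->GetFunctionEntryByOffset(addr2, 0, &fpo2, sizeof(fpo2), 0);

if (fpo1.cdwParams != fpo2.cdwParams)
{
    // different total parameter size => these are different overloads
}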
