Where is using null BSTR documented? - windows

It's at least common practice to treat null BSTR (null WCHAR* pointer) as an empty string and design all the code manipulating BSTRs accordingly. Answers to this question say the same.
Where is this practice documented? Is there any official document describing this convention?

Well, the link given in the accepted answer to that question is to an article by Eric Lippert, Eric's Complete Guide To BSTR Semantics. While it would most definitely not be official documentation, Lippert is a well know authority on COM (particularly in the scripting arena).
However, the official documentation has this to say:
A BSTR with no data elements is either an empty BSTR or a NULL BSTR. An empty BSTR indicates a present, but zero-length, data value. A NULL BSTR indicates a data value that is not present.
So, officially they are both BSTRs with no data elements, but with slightly different semantics (though there's nothing to say that those 2 cases need to be handled differently in your application). In this case, I'd certainly follow Lippert's advice of treating them identically. For me, his real-world experience with how actual implementations work carries more weight than the one sentence in the official BSTR doc.

Michael Burr gives what I think should be the accepted answer. It's unfortunate that the page for BSTR in MSDN doesn't document this practice.
In addition, you can infer this behavior from these pages in the MSDN documentation:
SysFreeString page reports that if bstr is null the function simply returns.
SysStringLen page reports that passing a null for the bstr parameter returns zero for the string length.
SysStringByteLen page reports the same behavior; null means zero length.
However, the documentation is not complete:
SysReAllocString does not mention what will happen if *pbstr is null.
SysReAllocStringLen does not mention what will happen if *pbstr is null.

Handling null BSTR as an empty string seams to be a common practice, but the technical documentation found at Microsoft actually states that there is a difference between those two.
For any document referencing both [MS-DTYP] and [MS-OAUT], specifying
BSTR in a wire representation context MUST be considered as a
reference to the transmitted BSTR type, while specifying BSTR in a
memory representation context MUST be considered as a reference to the
presented BSTR type ([MS-DTYP] section 2.2.5). Reflecting the
terminology used for presented BSTRs, a NULL BSTR, or NULL transmitted
BSTR, is defined as the wire representation of a NULL presented BSTR;
and an empty BSTR, or empty transmitted BSTR, is defined as the wire
representation of a zero-length presented BSTR. Preserving this
distinction in the wire representation enables clients and servers to
distinguish between NULL presented BSTRs and zero-length presented
BSTRs, and thus associate possibly different, application-specific
semantics to these two values.
https://msdn.microsoft.com/en-us/library/cc237580.aspx
So it's up to the implementations whether they handle both equal or not.

Related

std::move() is just casting?

I have read a few (pesudo) implementions of std::move(). And all are just casting away the reference of the parameter and then returning it as a rvalue reference.
It doesn't do anything more than that.
However, I am curious:
1. whether it does more than that.
2. whether standard explicitly states that so that the caller should be aware the side effect.
In the book The C++ Programming Language 4th edition, it states "move(x) marks x for destruction so that move(x) should be used with care".
Does it mean that there is a side effect imposed by the standard and so compiler can do certain optimization?
Thanks in advance.
Yes, that's exactly as it is described in the standard. In N4659 (which is last draft I found)
it says in §23.2.5
template <class T> constexpr remove_reference_t<T>&& move(T&& t) noexcept;
Returns: static_cast<remove_reference_t<T>&&>(t)
It doesn't mark anything for destruction, and it doesn't change the object but object may be changed in function that accepts rvalue (such as move constructor, move assignment operator)
Yes, std::move is a bit of a misnomer as it doesn't actually move anything. It is used to indicate that an object may be "moved from".
It does this by casting the object to a T&&. cppreference states the return value is static_cast<typename std::remove_reference<T>::type&&>(t). (btw, that is exactly what VS2017 does)
I don't know precisely what the standard says on the matter.

Move CComBSTR to std::vector?

Is there a way to move CComBSTR object to std::vector without copying the underlying string? It seems the following code doesn't work.
CComBSTR str(L"SomeStr");
std::vector<CComBSTR> vStr;
vStr.push_back((CComBSTR)str.Detach());
Your code doesn't work because Detach gives you a BSTR, not a CComBSTR. You could have used std::vector<BSTR> vStr, as long as you realise that BSTR points to the first character of the string, but there's also a length prefix before the memory it points to (see, e.g. BSTR (Automation)), so you'd need to be careful with managing it.
Alternatively, you could have used
CComBSTR str(L"SomeStr");
std::vector<CComBSTR> vStr;
vStr.push_back(str);
which would make a copy of the string.
The MSDN page on CComBSTR does not indicate it has any move-semantics, but it might have been added to the code without the documentation being updated.
I don't think so though, in which case your std::move solution in your answer will be the same as my second example: a simple copy.
I've done some experiment and seems the latest CComBSTR supports move semantics so it can be written as:
CComBSTR str(L"SomeStr");
std::vector<CComBSTR> vStr;
vStr.push_back(std::move(str));

Why should I not use identity based operations on Optional in Java8?

The java.util.Optional javadoc states that:
This is a value-based class; use of identity-sensitive operations (including reference equality (==), identity hash code, or synchronization) on instances of Optional may have unpredictable results and should be avoided.
However, this junit snippet is green. Why? It seems to contradict the javadoc.
Optional<String> holder = Optional.ofNullable(null);
assertEquals("==", true, holder == Optional.<String>empty());
assertEquals("equals", true, holder.equals(Optional.<String>empty()));
You shouldn’t draw any conclusions from the observed behavior of one simple test ran under a particular implementation. The specification says that you can’t rely on this behavior, because the API designers reserve themselves the option to change the behavior at any time without notice.
The term Value-based Class already provides a hint about the intended options. Future versions or alternative implementations may return the same instance when repeated calls with the same value are made, or the JVM might implement true value types for which identity based operations have no meaning.
This is similar to instances returned by the boxing valueOf methods of the wrapper types. Besides the guaranty made for certain (small) values, it is unspecified whether new instances are created or the same instance is returned for the same value.
“A program may produce unpredictable results if it attempts to distinguish two references to equal values of a value-based class…” could also imply that the result of a reference comparison may change during the lifetime of two value-based class instances. Think of a de-duplication feature. Since the JVM already has such a feature for the internal char[] array of Strings, the idea of expanding this feature to all instances of “value-based classes” isn’t so far-fetched.
Optional.of() makes a new Object for every non-null value. So it is very likely that comparing 2 Optionals will be two references even if the Optional refers to the same value.
However, your example shows that Optional.empty() re-uses the same singleton instance. That is probably the only time the same Optional instance is ever returned twice.

Move Semantics in Golang

This from Bjarne Stroustrup's The C++ Programming Language, Fourth Edition 3.3.2.
We didn’t really want a copy; we just wanted to get the result out of
a function: we wanted to move a Vector rather than to copy it.
Fortunately, we can state that intent:
class Vector {
// ...
Vector(const Vector& a); // copy constructor
Vector& operator=(const Vector& a); // copy assignment
Vector(Vector&& a); // move constructor
Vector& operator=(Vector&& a); // move assignment
};
Given that definition, the compiler will choose the move constructor
to implement the transfer of the return value out of the function.
This means that r=x+y+z will involve no copying of Vectors. Instead,
Vectors are just moved.As is typical, Vector’s move constructor is
trivial to define...
I know Golang supports traditional passing by value and passing by reference using Go style pointers.
Does Go support "move semantics" the way C++11 does, as described by Stroustrup above, to avoid the useless copying back and forth? If so, is this automatic, or does it require us to do something in our code to make it happen.
Note: A few answers have been posted - I have to digest them a bit, so I haven't accepted one yet - thanks.
The breakdown is like here:
Everything in Go is passed by value.
But there are five built-in "reference types" which are passed by value as well but internally they hold references to separately maintained data structure: maps, slices, channels, strings and function values (there is no way to mutate the data the latter two reference).
Your own answer, #Vector, is incorrect is that nothing in Go is passed by reference. Rather, there are types with reference semantics. Values of them are still passed by value (sic!).
Your confusion suppsedly stems from the fact your mind is supposedly currently burdened by C++, Java etc while these things in Go are done mostly "as in C".
Take arrays and slices for instance. An array is passed by value in Go, but a slice is a packed struct containing a pointer (to an underlying array) and two platform-sized integers (the length and the capacity of the slice), and it's the value of this structure which is copied — a pointer and two integers — when it's assigned or returned etc. Should you copy a "bare" array, it would be copied literally — with all its elements.
The same applies to channels and maps. You can think of types defining channels and maps as declared something like this:
type Map struct {
impl *mapImplementation
}
type Slice struct {
impl *sliceImplementation
}
(By the way, if you know C++, you should be aware that some C++ code uses this trick to lower exposure of the details into header files.)
So when you later have
m := make(map[int]string)
you could think of it as m having the type Map and so when you later do
x := m
the value of m gets copied, but it contains just a single pointer, and so both x and m now reference the same underlying data structure. Was m copied by reference ("move semantics")? Surely not! Do values of type map and slice and channel have reference semantincs? Yes!
Note that these three types of this kind are not at all special: implementing your custom type by embedding in it a pointer to some complicated data structure is a rather common pattern.
In other words, Go allows the programmer to decide what semantics they want for their types. And Go happens to have five built-in types which have reference semantics already (while all the other built-in types have value semantics). Picking one semantics over the other does not affect the rule of copying everything by value in any way. For instance, it's fine to have pointers to values of any kind of type in Go, and assign them (so long they have compatible types) — these pointers will be copied by value.
Another angle to look at this is that many Go packages (standard and 3rd-party) prefer to work with pointers to (complex) values. One example is os.Open() (which opens a file on a filesystem) returning a value of the type *os.File. That is, it returns a pointer and expects the calling code to pass this pointer around. Surely, the Go authors might have declared os.File to be a struct containing a single pointer, essentially making this value have reference semantics but they did not do that. I think the reason for this is that there's no special syntax to work with the values of this type so there's no reason to make them work as maps, channels and slices. KISS, in other words.
Recommended reading:
"Go Data Structures"
"Go Slices: Usage and Internals"
Arrays, slices (and strings): The mechanics of 'append'"
A thead on golang-nuts — pay close attention to the reply by Rob Pike.
The Go Programming Language Specification
Calls
In a function call, the function value and arguments are evaluated in
the usual order. After they are evaluated, the parameters of the call
are passed by value to the function and the called function begins
execution. The return parameters of the function are passed by value
back to the calling function when the function returns.
In Go, everything is passed by value.
Rob Pike
In Go, everything is passed by value. Everything.
There are some types (pointers, channels, maps, slices) that have
reference-like properties, but in those cases the relevant data
structure (pointer, channel pointer, map header, slice header) holds a
pointer to an underlying, shared object (pointed-to thing, channel
descriptor, hash table, array); the data structure itself is passed by
value. Always.
Always.
-rob
It is my understanding that Go, as well as Java and C# never had the excessive copying costs of C++, but do not solve ownership transference to containers. Therefore there is still copying involved. As C++ becomes more of a value-semantics language, with references/pointers being relegated to i) smart-pointer managed objects inside classes and ii) dependence references, move semantics solves the problem of excessive copying. Note that this has nothing to do with "pass by value", nowadays everyone passes objects by Reference (&) or Const Reference (const &) in C++.
Let's look at this (1) :
BigObject BO(big,stuff,inside);
vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BO);
Or (2)
vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BigObject(big,stuff,inside));
Although you're passing by reference to the vector vo, in C++03 there was a copy inside the vector code.
In the second case, there is a temporary object that has to be constructed and then is copied inside the vector. Since it can only be accessed by the vector, that is a wasteful copy.
However, in the first case, our intent could be just to give control of BO to the vector itself. C++17 allows this:
(1, C++17)
vector<BigObject> vo;
vo.reserve(1000000);
vo.emplace_back(big,stuff,inside);
Or (2, C++17)
vector<BigObject> vo;
vo.reserve(1000000);
vo.push_back(BigObject(big,stuff,inside));
From what I've read, it is not clear that Java, C# or Go are exempt from the same copy duplication that C++03 suffered from in the case of containers.
The old-fashioned COW (copy-on-write) technique, also had the same problems, since the resources will be copied as soon as the object inside the vector is duplicated.
Stroustrup is talking about C++, which allows you to pass containers, etc by value - so the excessive copying becomes an issue.
In Go, (like in Delphi, Java, etc) when you pass a container type, etc they are always references, so it's a non-issue. Regardless, you don't have to deal with it or worry about in GoLang - the compiler just does what it needs to do, and from what I've seen thus far, it's doing it right.
Tnx to #KerrekSB for putting me on the right track.
#KerrekSB - I hope this is the right answer. If it's wrong, you bear no responsibility.:)

How to best convert CString to BSTR to pass it as an "in" parameter into a COM method?

I need to convert a CString instance into a properly allocated BSTR and pass that BSTR into a COM method. To have code that compiles and works indentically for both ANSI and Unicode I use CString::AllocSysString() to convert whatever format CString to a Unicode BSTR.
Since noone owns the returned BSTR I need to take care of it and release it after the call is done in the most exception-safe manner posible and with as little code as possible.
Currently I use ATL::CComBSTR for lifetime management:
ATL::CComBSTR converted;
converted.Attach( sourceString.AllocSysString() ); //simply attaches to BSTR, doesn't reallocate it
interface->CallMethod( converted );
what I don't like here is that I need two separate statements to just construct the ATL::CComBSTR bound to the convertion result.
Is there a better way to accomplish the same task?
CComBSTR has overloaded constructors for both char* and wchar_t*, which make the call to SysAllocString() on your behalf. So the explicit allocation in your code snippet is actually unnecessary. The following would work just as well:
ATL::CComBSTR converted = sourceString;
interface->CallMethod(converted);
Furthermore, if you have no need to use the converted BSTR elsewhere in your code, you can perform the object construction in-place in the method call, like so:
interface->CallMethod(ATL::CComBSTR(sourceString));
The same applies to the _bstr_t class, which can be used instead of CComBSTR if you don't want a dependency on the ATL.
One of the confusing aspects of Windows programming is managing the conversion of Visual Basic style strings to/from C language style strings. It isn't that it is so difficult, it is just difficult to remember the details. It is usually not done often, and the MSDN documentation is so voluminous that it is difficult to find answers to your questions. But, the worst part is that you could perform some typecast that compiles fine, but doesn't work the way you expect. This results in code that doesn't work, and the bugs are hard to track down. After some experience, you learn to make sure your string conversions are doing what you expect.
C strings are arrays of characters terminated by a NULL character. Visual Basic strings differ in that the length of the string precede the characters in the string. So, a VB string knows its own length. In addition, all VB strings are Unicode (16 bits per character).
String Types
BSTR/C String conversions are required if:
You are doing COM programming in C/C++
You are writing multiple language applications, such as C++ DLL's accessed by Visual Basic applications.
One of _bstr_t constructors allows you to simply attach to existing BSTR so that you can have the exception that you want from CString::AllocSysString when BSTR allocation fails.
// _bstr_t simply attaches to BSTR, doesn't reallocate it
interface->CallMethod( _bstr_t(sourceString.AllocSysString(), false) );
The _bstr_t constructor documentation says:
_bstr_t(
BSTR bstr,
bool fCopy
);
fCopy
If false, the bstr argument is attached to the new object without making a copy by calling SysAllocString.
On the other hand, CComBSTR constructor doesn't seem to have the corresponding signature; although it can be used as well if BSTR allocation failure exception is not really needed, as mentioned by Phil Booth in his answer.

Resources