How does COM achieve language interop? - windows

I understand how COM can achieve compiler agnostic C++ code, since it defines an ABI by being careful what features of the C++ language to use. It's just C++ code talking to C++ code in a really clever way. However I still don't understand how it can allow for language interop with C# or Javascript for example.
Where is the boundary? The only explanation I have right now is that the language compiler itself must have special support for COM so that it can generate the proper assembly code to allow for accurate communication between caller/callees.

Since you've tagged your question with WinRT, I assume you're asking specifically about how this is achieved by WinRT language projections. In that case, all languages must have some way to map their natural language constructs to the COM ABI that WinRT defines. That ABI is derived from metadata encoded in the ECMA 335 standard and special rules are applied to transform the abstract metadata into a concrete ABI. There are naturally different ways to achieve this. The CLR itself was updated to support WinRT in C#. The Visual C++ compiler was (sadly) updated with language extensions to support WinRT via C++/CX. The C++/WinRT approach is very different in that it requires only a standard C++ compiler and all of the knowledge about WinRT is delivered via a standard C++ header-only library. Other languages might take different approaches, but at the end of the day they must agree on the way that types expressed in metadata are transformed into objects and virtual function calls on the ABI based on COM.
And while this process is not well documented at the moment, C++/WinRT is one of the only open source language projections and thus acts as a useful reference implementation for those who need to understand how WinRT works under the hood.
https://github.com/microsoft/cppwinrt

The "Type Library" is what enables interop of COM components between different languages.
https://learn.microsoft.com/en-us/windows/desktop/midl/com-dcom-and-type-libraries
A type library (.tlb) is a binary file that stores information about a
COM or DCOM object's properties and methods in a form that is
accessible to other applications at runtime. Using a type library, an
application or browser can determine which interfaces an object
supports, and invoke an object's interface methods. This can occur
even if the object and client applications were written in different
programming languages. The COM/DCOM run-time environment can also use
a type library to provide automatic cross-apartment, cross-process,
and cross-machine marshaling for interfaces described in type
libraries.
The other approach for language interop (e.g. C++ projecting objects to Javascript) is that a COM object can implement IDispatch.

It’s not magic, of course.
COM sets up rules for language interop. It’s just a contract, with some helpful tooling. Each language that wants to support COM has to find a way to abide by the rules on its own. They all have to provide their own compatible mechanism one way or another.
In the case of C++, the rules appear to come for free as you mentioned, but be aware there is one caveat: the language standard does not specify the layout and mechanism of classes and virtual functions. The method mimicked by COM is one extremely common implementation of virtual calling (“the VTable”), and COM follows the exact layout used by the Microsoft compiler. But you can have a perfectly valid C++ compiler where classes with virtual functions would not be compatible with the COM layout. It’s just that nobody does that, at least not in Windows compilers. So even in C++ there is some “meeting in the middle” by the compiler.
In C, you have to do the whole thing by hand. Other languages might allow you to do the same thing (assembler of course).
To help compiled languages exchange information about specific contracts, COM provides Type Libraries and mechanisms to read them. A compiler or language that want to take advantage of them also has to “meet in the middle” and learn how to process them (for example, the Microsoft C++ #import directive; the VB6 Libraries menu).
No every language will support everything you can do in COM, because there is a point (in more obscure features) where the return on investment in implementing support in the language doesn’t pan out. Each language has to pick its own limitations. There is plenty of stuff that you can do in COM (read the IDL specs) that VB6 cannot do.
Because following COM rules in a script-like language is between inpractical and impossible, COM offers a higher-level approach (Automation) that is more amenable to dynamic languages, even if more limited. But a language implementer that wants to provide client support for Automation has to implement an understanding of the IDispatch interface, an activation mechanism, and a translation to its language’s proper facilities. And a scripting language wanting to provide support for creating COM servers has to work even harder to implement a valid COM IDispatch implementation and a standalone host engine on behalf of the user scripts. Even VBScript couldn’t do this at the beginning, until Microsoft added .SCR support with the Windows Scripting Host. “Meeting in the middle” again.
If a language wants to support both pure COM and Automation, they need to work double hard; support for one does not automatically give you support for the other.
For .NET languages like C#, most of the work is done for both native COM and Automation inside of the .NET Runtime, which provides the implementation of the COM Callable Wrappers (CCW) and Runtime Callable Wrappers (RCW) necessary to interact with COM, and handling the conflicts between the Reference Count approach of COM and the GC approach of .NET. Microsoft did all the work in one place so individual .NET language designers didn’t have to.
So, yes, the language implementer has to work extra to give the language special support for COM: following the binary layout rules, implementing a translation layer when needed, and/or possibly providing tooling to read Type Libraries.
Language Interop requires both sides (the caller and the callee) to “meet in the middle” somewhere. COM is just a specification that gives designers that middle ground, “a place where all can meet”.

Related

Access native system APIs when implementing a language with LLVM

I'm interested in learning about compilers and their creation, so I've been looking into various tools such as LLVM. It seems like a great framework to work with, but I'm a little confused how you can access native APIs with it.
Specifically, I'm interested in creating a language that has GUI or at least a windowing system built in. LLVM doesn't seem to wrap that functionality, so would I manually need to write assembly that called the APIs provided by each system (e.g. Win32)?
For example, the Red language claims to have a "Cross-platform native GUI system" built in. I assume they manually wrote the backend for that which used different system calls depending on the current system, or piggy backed on Rebol which did that instead.
Is such a thing possible or viable when using LLVM, which does a lot of the backend abstraction for you?
LLVM does not have an API geared toward abstraction of the use APIs. What you CAN do is write a runtime library for your language, and then use LLVM to generate runtime calls as needed. I have some experimentation and found that I preferred to write a runtime in C++ and then create some C bindings. The C bindings are necessary because C++ name mangling will make it very difficult to link against your runtime library, whereas with C the name of a symbol in a shared lib will be the same as that of the function.

Universal Windows Platform Apps and C++/CLI (VS 2015 RC1)

I have some C++/CLI code which derives from the .NET System Namespace classes.
Is there a way to reuse this code for Universal Windows Platform Apps?
I can't get a reference to the System Namespace in C++, though in C# it is possible. It looks like there is only support for C++/Cx code and not for managed C++/CLI.
The syntax and keywords of the C++/CX extension resembles C++/CLI a great deal. But that's where similarity ends, they have nothing whatsoever in common. C++/CX gets compiled directly to native code, just like native C++. But C++/CLI is compiled to MSIL, the intermediate language of .NET. Their syntax looks so similar because they both solve the same problem, interfacing C++ to a foreign type system. .NET's in the case of C++/CLI, WinRT in the case of C++/CX.
Which is the basic reason why you cannot use the System namespace, it is a .NET namespace. You instead use the std namespace, along with the Platform and Windows namespaces for WinRT specific types. The compiler cannot import .NET reference assemblies with the /ZW compile option in effect, only WinRT metadata files, the ones with the .winmd filename extension. Which are an extension to the COM .tlb type library file format, the ones you previously had to import with the #import directive.
Which in itself is another major source for confusion, the internal .winmd file format was based on the format of .NET metadata. Most .NET decompilers can show you the content of a .winmd file because of this. But again just a superficial similarity, it is completely unrelated to a .NET assembly. It can just contain declarations, not code. Best to compare it to a .h file you'd use in a native C++ project. Or a .tlb file if you previously had exposure to COM.
Knowing how COM works can be very helpful to grok what this is all about. It is in fact COM that lies at the core of WinRT, the basic reason why your C++/CX project can be easily used by a program written in a completely different language like Javascript or VB.NET. A WinRT app is actually an out-of-process COM server. A class library or WinRT component is actually an in-process COM server. COM object factories work differently, the scope is limited to the files named in the package manifest. C++/CX is part of the language projection that hides COM, along with the C++ libraries you link that implement the Platform namespaces. WinRT would be still-born if programmers had to write traditional COM client code. You still can in native C++, the WRL library does little to hide the plumbing.
WinRT readily supports code written in a managed language like C# or VB.NET, the language projection is built into the framework and highly invisible. But not C++/CLI, a structural limitation. A Store/Phone/Universal app targets a subset of the .NET Framework named .NETCore. Better known these days as CoreCLR, the parts that were open-sourced. Which does not support module initializers, critical to C++/CLI.
Enough introduction and getting to the answer: no, you have no use for your C++/CLI code and you'll have to rewrite it. You'll have a decent shot at porting the native C++ code that your C++/CLI wrapper interfaced with, as long as it observes the api limitations. You should always start there first, given that it is easy to do and instantly tells you if your native C++ code is using verboten api functions, the kind that drains a battery too quickly or violates the sandbox restrictions.
The ref class wrappers however have to be significantly tweaked. Little reason to assume that will be a major obstacle, it could still structurally be similar. Biggest limitations are the lack of support for implementation inheritance, a COM restriction, and having to replace the code that used .NET Framework types with equivalent C++ code. The typical hangup is that there tends to be a lot of it, the original author would normally have favored the very convenient .NET types over the standard C++ library types. YMMV.

Hat operator usage in Windows Phone 8 native projects

I'm coming from C# background, learning C++, specifically on the Windows Phone 8 platform.
Many code samples (installed with the SDK) show usage of the Hat operator ^ (Reference here: Types that wear hats).
For example:
void PhoneDX::Initialize(CoreApplicationView^ applicationView)
{
// ... function body
}
I am wondering:
Why are most of the pointers being defined in that manner, specifically on Windows Phone 8 ?
Is that syntax mandatory? Suppose i am using a C++ native library from another platform (that doesn't use this syntax). Should it work with no issues?
A hat is a compiler-supported smart pointer type that is designed to make Windows Runtime types easier to work with from C++ code. As discussed in "Types That Wear Hats" and the other articles in that series, the C++/CX language extensions are optional: any code that can be written using C++/CX can be written in C++ without using the language extensions, albeit at greater code complexity and verbosity.
The key here is that hats are designed to facilitate code that makes use of Windows Runtime types. In general, you should confine your use of C++/CX and Windows Runtime types to the boundary of your components: most of your code should be standard, portable, normal C++ code. C++/CX should be used (1) to wrap C++ code to make it consumable through the Windows Runtime and (2) to use other Windows Runtime components from your component.
So, yes, the syntax is optional, but you should strongly consider using it when writing code that must work with Windows Runtime types. You should be able to use any ordinary C++ code, without modification, with the caveat that Windows Store apps and Windows Phone apps run with low privileges and some facilities are not available (e.g., there is no console, so console I/O doesn't work, and the runtime provides specialized process lifetime management facilities, so calling exit is a bad idea).
You might have a harder time grasping it since it's somewhat of a compound leap from (1) C# references to (4) C++/CX hats, with a stop in the middle for (2) C++ pointers then (3) reference counted objects.
The ^ smart pointers are a language extension, not part of standard C++, for handling the Windows Runtime types (which are reference counted)
so answering your points:
If you're seeing them a lot with Windows Phone 8, it's because ^ is used for Windows Runtime types, which would be used a lot in that platform samples (since the samples are trying to demonstrate that platform's api and features).
You would need to use conventions for that library, which would probably require that you use the types it defines (if it has its own smart pointers) or standard/regular pointers (i.e. *) or Standard Library smart pointers (i.e. shared_ptr).
Some concepts that should help you understanding this would be the lifetime of C++ objects, deterministic destruction (vs. waiting for a garbage collector to kick in), reference counting, stack/static vs. heap/dynamic allocation of objects.

Windows API functions

Are they standard code c or c++ code? what are they?
The original Win32 API is C-based. There are however a substantial number of services within Windows that are COM based. Good examples are the clipboard, drag+drop, the shell, the user mode driver framework, DirectX. While it is technically possible to write COM code in C, it is excruciatingly painful to do so.
Realistically you use C++ there. And a C++ class library to make the original C-based API less painful, especially for GUI code.
They're standard C code, if you're programming against pure Windows API.
A C++ based wrapper called MFC is available.
All of this is being pushed out in favor of .NET framework.
The standard Windows API is a C library. Wrappers exist for other languages (C++, etc).
Just read it on wikipedia.
Windows API is langugage neutral. It is neither C nor C++. Microsoft says that Windows itself is written mainly in C++, but you don't need any classes for vast majority of the API and even classes in API (e.g. in Direct X) can be used in pure C without classes.
Although some C programmers think it is a C library, compiler of a programming language must support proprietary Windows calling model, it is not a classic C calling convention. (Obviously almost each real world C compiler supports it nowadays.)

IFileOpenDialog and IFileSaveDialog from VBA

How do you call IFileOpenDialog and IFileSaveDialog from VBA?
According to Microsoft, applications written for Windows 7 and later should use IFileOpenDialog/IFileSaveDialog API calls instead of GetOpenFileName/GetSaveFileName (see Using the Common File Dialog). This is especially important for full Library support.
Short answer: it's probably not worth the effort.
Longer answer: the CFD interfaces don't extend IDispatch, which makes them impossible to call via late binding from VBA. That doesn't mean they can't be called from VBA, but it means they require a typelib to describe the "shape" of the IUnknown-based CFD interfaces. Unfortunately, Microsoft doesn't provide the CFD interface definitions in a typelib. You can roll your own typelib by reverse-engineering the header files (or try to find the original IDL in the SDK), but you'd then have to register that typelib on every machine you want to use it on (the tools for which are not shipped on the machine, unlike regsvr32 for COM stuff). Assuming you did all that, you could then reference the typelib from VBA, and conditionally call it on Vista or higher OSes. You could also shim through to a small .NET assembly that would create a System.Windows.Forms.FileDialog-derived type and marshal the results back to VBA- that would be much easier, but still more-or-less require that you register the assembly on every machine (or use C++/CLI or other hacks to export a managed DLL function), and it requires you to take a .NET dependency.
They sure didn't make it easy... :) Good luck!

Resources