Access native system APIs when implementing a language with LLVM - user-interface

I'm interested in learning about compilers and their creation, so I've been looking into various tools such as LLVM. It seems like a great framework to work with, but I'm a little confused how you can access native APIs with it.
Specifically, I'm interested in creating a language that has GUI or at least a windowing system built in. LLVM doesn't seem to wrap that functionality, so would I manually need to write assembly that called the APIs provided by each system (e.g. Win32)?
For example, the Red language claims to have a "Cross-platform native GUI system" built in. I assume they manually wrote the backend for that which used different system calls depending on the current system, or piggy backed on Rebol which did that instead.
Is such a thing possible or viable when using LLVM, which does a lot of the backend abstraction for you?

LLVM does not have an API geared toward abstraction of the use APIs. What you CAN do is write a runtime library for your language, and then use LLVM to generate runtime calls as needed. I have some experimentation and found that I preferred to write a runtime in C++ and then create some C bindings. The C bindings are necessary because C++ name mangling will make it very difficult to link against your runtime library, whereas with C the name of a symbol in a shared lib will be the same as that of the function.

Related

How to create an embeddable C-API library in Go?

I am planning to write a cross-platform app that has most of its functionality shared across all platforms (Linux, OS X, Windows, iOS, Android).
These are mostly helper function (calculations, internal lists, networking etc.) so I figured it would be convenient to have those functions in a library I can compile for every platform while still being able to create custom UI for each platform individually.
Dominant languages across those platforms I mentioned are C, Objective-C, C# and Java. All these languages support calling C-API functions from a library either directly or via internal wrappers. Since I don't want to write 80% of my application's code in C/C++, I searched and found Go.
cgo seems to be the solution for my problem.
My current thought is to code the core library in Go and then compile it for each platform, however, invoking go build does not create anything at all.
I import "C".
I have declared a func and added the //export statement before.
I read about gccgo but people keep pointing out that it is outdated and should not be used.
Maybe anyone can point out a flaw in my thoughts or help me bring this library file together. Thanks in advance.
If your aim is to build a library that can be linked into arbitrary C, Objective-C or Java programs, you are out of luck with the currently released standard tool chain. There are plans to change this in the future, but at present the Go runtime is not embeddable in other applications.
While cgo will allow you to export functions to be called from C, this is only really useful for cases when the C code you call from Go needs to call back to Go.

Can gobject be used as a language runtime in the same way cocoa can?

Now that swift has been released by Apple I have have been thinking the posibility of using gobject as a runtime for existing languages such as rust or even swift.
My main concern is that while vala does this, it compiles to c before and needs language bindings even if the library that you are trying to use already uses gobject and even then somehow vala cant use features that c doesn't support such as function overloading, while objective-c doesn't support it, but swift does and can still be used with it.
On the good side both runtime systems have many similarities, such as using reference counting, having signals and being more dynamic than the average
You can view GObject (and in extension GLib and the ecosystem of GLib based libraries) as a common language runtime for several languages:
Vala / Genie
Gjs (then GNOME version of ECMAScript / JavaScript)
C
C++
Python (through PyGObject)
Probably others, really any language that can talk to the C API
Actually it really is an extension to the C runtime (which is the core common language runtime of most Operating Systems) that adds OOP support.
There are other such technologies like the Java JVM the .NET CLR and as you describe Apple is using the Objective C runtime now for multiple languages as well.
There is (in principle) nothing that prevents someone to write a Rust or Swift compiler that does something similar to Vala (emits C code and uses GObject as it's object system).
About your concern:
Vala could as well emit object code directly (without the intermediate "compile to C" step).
There are some advantages to the concept the valac is written at the moment though:
You can take the emitted C files and use them in a C program without the need to have valac installed
It's much easier for Vala to consume C files and
The foreign function interface (called VAPI in Vala) was designed to make it easy to consume C libraries and abstract from common C idioms (like zero terminated strings, passing array with a length parameter, etc.)
The generated C code can be optimized by the C compiler
Standard C tools can be used to inspect the generated C code
You can actually read the C code to see what Vala does internally (big plus for people that already know C)
Vala uses C as a higher level "assembly" language.

Do DLLs built with Rust require libgcc.dll on run time?

If I build a DLL with Rust language, does it require libgcc*.dll to be present on run time?
On one hand:
I've seen a post somewhere on the Internet, claiming that yes it does;
rustc.exe has libgcc_s_dw2-1.dll in its directory, and cargo.exe won't run without the dll when downloaded from the http://crates.io website;
On the other hand:
I've seen articles about building toy OS kernels in Rust, so they most certainly don't require libgcc dynamic library to be present.
So, I'm confused. What's the definite answer?
Rust provides two main toolchains for Windows: x86_64-pc-windows-gnu and x86_64-pc-windows-msvc.
The -gnu toolchain includes an msys environment and uses GCC's ld.exe to link object files. This toolchain requires libgcc*.dll to be present at runtime. The main advantage of this toolchain is that it allows you to link against other msys provided libraries which can make it easier to link with certain C\C++ libraries that are difficult to under the normal Windows environment.
The -msvc toolchain uses the standard, native Windows development tools (either a Windows SDK install or a Visual Studio install). This toolchain does not use libgcc*.dll at either compile or runtime. Since this toolchain uses the normal windows linker, you are free to link against any normal Windows native libraries.
If you need to target 32-bit Windows, i686- variants of both of these toolchains are available.
NOTE: below answer summarizes situation as of Sep'2014; I'm not aware if it's still current, or if things have changed to better or worse since then. But I strongly suspect things have changed, given that 2 years have already passed since then. It would be cool if somebody tried to ask steveklabnik about it again, then update below info, or write a new, fresher answer!
Quick & raw transcript of a Rust IRC chat with steveklabnik, who gave me a kind of answer:
Hi; I have a question: if I build a DLL with Rust, does it require libgcc*.dll to be present on run time? (on Windows)
I believe that if you use the standard library, then it does require it;
IIRC we depend on one symbol from it;
but I am unsure.
How can I avoid using the standard library, or those parts of it that do? (and/or do you know which symbol exactly?)
It involves #[no_std] at your crate root; I think the unsafe guide has more.
Running nm -D | grep gcc shows me __gc_personality_v0, and then there is this: What is __gxx_personality_v0 for?,
so it looks like our stack unwinding implementation depends on that.
I seem to recall I've seen some RFCs to the effect of splitting standard library, too; are there parts I can use without pulling libgcc in?
Yes, libcore doesn't require any of that.
You give up libstd.
Also, quoting parts of the unsafe guide:
The core library (libcore) has very few dependencies and is much more portable than the standard library (libstd) itself. Additionally, the core library has most of the necessary functionality for writing idiomatic and effective Rust code. (...)
Further libraries, such as liballoc, add functionality to libcore which make other platform-specific assumptions, but continue to be more portable than the standard library itself.
And fragment of the current docs for unwind module:
Currently Rust uses unwind runtime provided by libgcc.
(The transcript was edited slightly for readability. Still, I'll happily delete this answer if anyone provides something better formatted and more thorough!)

How to defeat framework injections?

Is anyone hardening their code in an attempt to detect injections? For example, if someone is trying to intercept a username/password via NSUrlConnection, they could use LD_PRELOAD/DYLD_LIBRARY_PATH, provide exports for my calls into NSUrlConnection, and then forward the calls to the real NSUrlConnection.
Ali gave excellent information below, but I'm trying to determine what measures should be take for a hostile environment, where a phone might be jail broken. Most applications don't have to care, but one class of apps do - high integrity software.
If you are hardening, what method(s) are you using? Is there a standard way to detect injections on Macs and iPhones? How are you defeating framework injections?
For iOS / CocoaTouch, loading dynamic libraries is not allowed* (except for the System frameworks). To build and distribute an Application thru the AppStore, you can only link with static libraries and system frameworks, no dynamic library.
So on iOS you can't use that for code injection, neither can you use LD_PRELOAD of course (as you don't have access to such environment variables on iOS).
Except for jailbroken iPhones probably, but people jailbreaking their iPhone should take upon themselves that jailbreaking is by definition lifting all securities provided by iOS to avoid things such as injections (so you can't expect to remove the lock on your door to avoid having to use your key… and still expect that you're still protected against thieves robbing your house ;-))
That's the advantage of the Sandboxing + CodeSigning + No dylib constraints on iOS. No Code injection possible.
(On OSX it is still possible anyway, inparticular using LD_PRELOAD)
[EDIT] Since iOS8, iOS also allows dynamic frameworks. But as that's still sandboxed (you can only load code-signed frameworks that are inside your application bundles, and can't load frameworks that comes from outside your app bundle) injection is still not possible*
*except if the user jailbreaks its phone but it means that s/he chose to get rid of all protections and purpose and thus put its phone at risk — we can't crack our phone security and still expect it to provide all the protections those securities provided
This is an answer specific to UNIX like operating systems, I apologize if it doesn't make sense for your question but I don't know your platform well. Simply don't create a dynamically linked executable.
There are two ways I can think of to do this. Method #2 is probably best for you. They're both similar.
Important for both, the executable must be statically compiled using -static at build time
Method 1 - static exe, manual load shared libraries by their trusted full paths
Manually dlopen each library you need via a full path and then get the function addresses via dlsym at runtime and assign them to function pointers to use them. You'll need to do this for every external function you want to use. I believe reentrant unsafe functions won't like this so for those that use static variables- you'll need to use the reentrant safe versions, these end with "_r" i.e. use strtok_r instead of strtok
This will be difficult or simple depending on what your app does and how many functions you're using.
Method 2 - Statically link the executable, period
You can solve your subversion problem by just linking a static executable to avoid using dynamic libraries at all. This will generate a much larger exe than the the dlopen()/dlsym() method. Build using the -static compile flag and instead of using, for example gcc bah.c -o bah lssl use gcc -static bah.c -o bah /usr/lib/libssl.a to use the statically compiled version of the libraries you need instead of the dynamic shared libraries. In other words, use -static and don't use -l while building
For either method:
Once built, use file bah to confirm the executable is statically linked. Or confirm by running ldd on it
Note you'll need statically compiled versions of all the libraries you're linking against present in your system. These files end with.a instead of .so)
Also note upgrading system libraries will not update your executable. If there's a new security bug in OpenSSL, you'll need to get the latest libssl.a and recompile it. If you use the dlopen()/dlsym() method you won't have this problem but you will have portability issues if symbols change in different versions
Each method has its pros and cons based on your needs.
Taking the method 1 dlopen and dlsym approach makes your code more "obfuscated" and smaller, but sacrifices portability in most cases so probably isn't what you want. The upside is that it can possibly benefit when security bugs are fixed system wide.

Windows API functions

Are they standard code c or c++ code? what are they?
The original Win32 API is C-based. There are however a substantial number of services within Windows that are COM based. Good examples are the clipboard, drag+drop, the shell, the user mode driver framework, DirectX. While it is technically possible to write COM code in C, it is excruciatingly painful to do so.
Realistically you use C++ there. And a C++ class library to make the original C-based API less painful, especially for GUI code.
They're standard C code, if you're programming against pure Windows API.
A C++ based wrapper called MFC is available.
All of this is being pushed out in favor of .NET framework.
The standard Windows API is a C library. Wrappers exist for other languages (C++, etc).
Just read it on wikipedia.
Windows API is langugage neutral. It is neither C nor C++. Microsoft says that Windows itself is written mainly in C++, but you don't need any classes for vast majority of the API and even classes in API (e.g. in Direct X) can be used in pure C without classes.
Although some C programmers think it is a C library, compiler of a programming language must support proprietary Windows calling model, it is not a classic C calling convention. (Obviously almost each real world C compiler supports it nowadays.)

Resources