load dynamic lib several times into multiple independent scopes - macOS

I want to dynamically load a library multiple times into independent scopes, so that each instance has its own memory. Is that possible?
I guess not in a portable way. Is it possible with dlopen and friends on POSIX/Unix/Linux? Right now I mainly care about macOS for my specific case (though I might need it later on other systems, too).
Background: the lib I want to use was not designed to be thread-safe. However, it should work fine if each thread just uses an independent instance of the lib.
More background: it is the readline lib. Adding multithreading support there would basically mean rewriting the whole thing.

so that each instance has its own memory.
Depends on what you mean by "its own memory". Obviously, with POSIX threads, all memory is shared, so an instance of the library cannot have "its own memory".
What you probably meant though is "so that each instance has its own copy of global variables", to which the answer is yes: see dlmopen(3) docs. You'll want to pass LM_ID_NEWLM to it.
Beware: this is Linux and Solaris only, and GDB doesn't know anything about libraries loaded into non-default linker space, so debugging problems is currently very hard.
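A minimal sketch of the dlmopen() approach, assuming a Linux system with libreadline installed (the library path is illustrative; link with -ldl on older glibc). Each LM_ID_NEWLM load gets its own link-map namespace, and therefore its own copy of the library's globals:

```c
/* Sketch: load two independent copies of a library with dlmopen().
 * Linux/Solaris only -- dlmopen() does not exist on macOS. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *h1 = dlmopen(LM_ID_NEWLM, "libreadline.so", RTLD_NOW | RTLD_LOCAL);
    void *h2 = dlmopen(LM_ID_NEWLM, "libreadline.so", RTLD_NOW | RTLD_LOCAL);
    if (!h1 || !h2) {
        fprintf(stderr, "dlmopen: %s\n", dlerror());
        return 1;
    }
    /* Each handle lives in its own namespace, so each copy has its
     * own global state; the two symbols resolve to distinct addresses. */
    char *(*rl1)(const char *) = (char *(*)(const char *))dlsym(h1, "readline");
    char *(*rl2)(const char *) = (char *(*)(const char *))dlsym(h2, "readline");
    printf("independent instances: %p vs %p\n", (void *)rl1, (void *)rl2);
    dlclose(h2);
    dlclose(h1);
    return 0;
}
```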

Related

How to specify the physical CoreIDs used for "CLOSE" when specifying OMP_PROC_BIND?

We are trying to optimize HPC applications using OpenMP on a new hardware platform. These applications need precise placement/pinning of their threads to cores, or performance drops by half. Currently, we provide the user a custom GOMP_CPU_AFFINITY map for each platform, but this is cumbersome because it differs on each hardware version, and even platforms with different firmware versions sometimes change their CoreID physical mappings - all things impossible for the user to detect on the fly.
It would be a great help if HPC applications could simply set GOMP_PROC_BIND to "close" and OpenMP would do the right thing for the given platform - but to make this possible, the hardware vendor would need to define what "close" means for each machine. We'd like to do this, but we can't tell how/where OpenMP gets CoreID lists to use for things like close, spread, etc. (For various external requirements, the CoreID spatial pattern on this machine would appear utterly random to a software writer.)
Any advice as to where/how OpenMP defines the CoreID lists for OMP_PROC_BIND so we could configure them? We are comfortable with the idea that we might need a custom version of OpenMP (with altered source code) for this platform if needed.
Thanks, everyone. :)
Jeff
Expanding on what @VictorEijkhout said...
You seem to have confused an environment variable that I can't find anywhere with Google (GOMP_PROC_BIND) with the OpenMP standard one (OMP_PROC_BIND). If GOMP_PROC_BIND exists, the name suggests that it is a GNU feature. Note too that one of the two Google hits for GOMP_PROC_BIND says "Code that reads the setting is buggy. Setting is invalid and ignored at runtime." So if you are setting that, it is unsurprising that it has no effect!
I will therefore answer for the more general case of OMP_PROC_BIND.
The binding of OpenMP threads to logical CPUs clearly has to be done at runtime, since, beyond its ISA, the compiler has no knowledge of the hardware on which the compiled code will run. Therefore you need to be looking at the runtime library code.
I have not looked at GNU's libgomp, but, where it can, LLVM's libomp uses the hwloc library to explore the machine hardware. Since hwloc also includes other useful tools for machine exploration (such as lstopo) it is likely that your effort is best invested in ensuring good hwloc support on your machine, at which point there will be no need to delve inside the OpenMP runtime.
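To illustrate, here is a minimal hwloc sketch (assuming hwloc is installed; build with -lhwloc) that prints the logical-to-OS core mapping, which is exactly the information a runtime needs to implement "close"/"spread" placement:

```c
/* Minimal hwloc sketch: print the logical-to-OS core numbering that a
 * runtime (or a custom binding tool) would consult. */
#include <hwloc.h>
#include <stdio.h>

int main(void) {
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);      /* discover the machine topology */

    int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    for (int i = 0; i < ncores; i++) {
        hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
        printf("logical core %u -> OS index %u\n",
               core->logical_index, core->os_index);
    }
    hwloc_topology_destroy(topo);
    return 0;
}
```

If hwloc reports a sane ordering on your platform, "close" and "spread" should come out right without touching the OpenMP runtime's source.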

Are builtin drivers always prioritized over loadable modules?

According to this note:
When multiple built-in modules (especially drivers) provide the same capability, they're prioritized by link order specified by the order listed in Makefile.
Furthermore:
However, the order in this file is indeterministic (depends on filesystem listing order of installed modules). This causes confusion.
The solution is two-parted. This patch updates kbuild such that it generates and installs modules.order which contains the name of modules ordered according to Makefile.
What happens if a system has multiple drivers providing the same capabilities from which some are built-ins and others are loadable modules?
Which one is prioritized in this case? Is it always the built-in? And how can I change the priority (if this is possible)?
I thought about reordering them in modules.alias or modules.order but this wouldn't work I guess since built-ins are not listed there - right?
In the meantime I found the answer and documented it here:
http://0x0001.de/linux-driver-loading-registration-and-binding
To make a long story short:
Yes, built-in drivers are in general prioritized over loadable drivers, simply because they are registered first, and binding follows a "first come, first served" principle.
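As an illustration of that "first come, first served" binding, here is a hypothetical platform-driver sketch; if two drivers (one built-in, one modular) register with the same .name, the one registered first, i.e. the built-in, gets to probe the device. The name "demo-device" is made up:

```c
/* Hypothetical sketch: a driver claiming a platform device by name.
 * Whichever driver calls platform_driver_register() first is probed
 * first, which is why a built-in (registered at kernel init, in link
 * order) wins over a module loaded later. */
#include <linux/module.h>
#include <linux/platform_device.h>

static int demo_probe(struct platform_device *pdev)
{
    dev_info(&pdev->dev, "bound by driver %s\n", pdev->dev.driver->name);
    return 0;
}

static struct platform_driver demo_driver = {
    .probe  = demo_probe,
    .driver = { .name = "demo-device" },  /* same name as the competing driver */
};
module_platform_driver(demo_driver);

MODULE_LICENSE("GPL");
```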
I don't think there is a priority as such. If you have two instances of the same driver (one built in, the other a kernel module), you will eventually end up with a compilation error or a module loading error due to duplicate definitions or similar.
If you have DIFFERENT drivers for the same hardware, it's not clear why you would do that.
Also, if somebody has already probed and created the devices, a later driver cannot do the same thing, since there would be a conflict.
If you are simply asking about precedence between a built-in module and an LKM: the built-in module definitely comes first. Kernel modules sit in a different memory location than the kernel, and an LKM is loaded later than the kernel itself.
So, if you load the same driver in two different ways at the same time, the LKM will have problems because of the conflict.

(Windows) How to make an auto-update function for a plugin?

I'm writing a plugin (.dll) for an application (.exe), and I want to implement an auto-update function for my plugin, but I've hit an issue: the update cannot be applied. Because my plugin is loaded while the application is running, the file cannot be replaced at run time; the update only takes effect after the application has exited. So, how can I do it?
This is my code: http://codepad.org/4a22ccMa
Thanks!
(this answers the original question, which has been edited to something completely different)
It depends upon the operating system (which I guess might be Windows, since you speak of DLLs; on Linux you have shared objects (in ELF) with different semantics). Read Levine's book Linkers and Loaders.
You might read more about Dynamic Software Updates. This is an entire research subject with a lot of scientific literature about it. Read e.g. the paper about Kitsune: Efficient, General-purpose Dynamic Software Updating for C (at least to understand several issues in your question).
On Linux, you could rename(2) the old .so and dlopen(3) the new one (and probably dlclose the older one, but you should do that later, when no active call frame on the call stack points into the old plugin), and my manydl.c example shows that you can practically dlopen a very large number (more than a million in practice) of shared objects.
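A minimal sketch of that rename + dlopen dance; the file names ("plugin.so" etc.) are made up for illustration, and older glibc needs -ldl at link time:

```c
/* Sketch: swap a plugin in place on Linux. The already-mapped old code
 * keeps working after rename() because the inode stays alive until the
 * mapping is dropped. */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *old = dlopen("./plugin.so", RTLD_NOW | RTLD_LOCAL);
    if (!old) { fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

    /* ... later, the updater has downloaded ./plugin.so.new ... */
    if (rename("./plugin.so", "./plugin.so.old") != 0) perror("rename");
    if (rename("./plugin.so.new", "./plugin.so") != 0) perror("rename");

    void *fresh = dlopen("./plugin.so", RTLD_NOW | RTLD_LOCAL);
    if (fresh) {
        /* Only dlclose() the old handle once no active call frame on
         * any stack still points into the old plugin's code. */
        dlclose(old);
        old = fresh;
    }
    dlclose(old);
    return 0;
}
```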
On Windows (which I don't know well) you probably need to dynamically load the new version of the plugin from a different file path. Probably, restarting your program after a plugin update would make things much easier.
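Along those lines, a hedged Win32 sketch (the file and export names are made up): load the updated copy from a different file name, since the DLL file currently loaded is locked and can't be overwritten:

```c
/* Sketch: load an updated plugin from a new path on Windows. */
#include <windows.h>
#include <stdio.h>

typedef int (*plugin_entry_t)(void);

int main(void) {
    HMODULE h = LoadLibraryA("plugin_v2.dll");   /* new version, new file name */
    if (!h) {
        fprintf(stderr, "LoadLibrary failed: %lu\n", GetLastError());
        return 1;
    }
    /* Resolve the (hypothetical) entry point and call through it. */
    plugin_entry_t entry = (plugin_entry_t)GetProcAddress(h, "PluginEntry");
    if (entry)
        entry();
    FreeLibrary(h);   /* the previous version can be unloaded the same way */
    return 0;
}
```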
(if you can afford that, switching to Linux might be very helpful, because I guess that it is much easier)
Notice that in some languages (and some of their implementations) replacing code is much easier than in C or C++ on Windows. I guess that in the CLR (managed code, e.g. C#) or in a JVM (e.g. Java, Scala, Clojure) it should be easier. And in Common Lisp (at least with SBCL) it is quite easy (in particular since Common Lisp is a homoiconic language).
Take care of the continuation, i.e. of the call stack. You'll understand that your question (also related to orthogonal persistence & application checkpointing) is much deeper and more difficult than you have imagined. And upgrading classes and their instances is also very difficult.

How to defeat framework injections?

Is anyone hardening their code in an attempt to detect injections? For example, if someone is trying to intercept a username/password via NSUrlConnection, they could use LD_PRELOAD/DYLD_LIBRARY_PATH, provide exports for my calls into NSUrlConnection, and then forward the calls to the real NSUrlConnection.
Ali gave excellent information below, but I'm trying to determine what measures should be taken for a hostile environment, where a phone might be jailbroken. Most applications don't have to care, but one class of apps does - high-integrity software.
If you are hardening, what method(s) are you using? Is there a standard way to detect injections on Macs and iPhones? How are you defeating framework injections?
For iOS / CocoaTouch, loading dynamic libraries is not allowed* (except for the system frameworks). To build and distribute an application through the App Store, you can only link with static libraries and system frameworks; no dynamic libraries.
So on iOS you can't use that for code injection, neither can you use LD_PRELOAD of course (as you don't have access to such environment variables on iOS).
Except probably for jailbroken iPhones, but people jailbreaking their iPhone should accept that jailbreaking is by definition lifting all the securities provided by iOS to avoid things such as injections (you can't remove the lock from your door to avoid having to use your key and still expect to be protected against thieves robbing your house ;-))
That's the advantage of the Sandboxing + CodeSigning + No dylib constraints on iOS. No Code injection possible.
(On OS X it is still possible anyway, in particular using LD_PRELOAD-style preloading via DYLD_INSERT_LIBRARIES.)
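For the OS X hardening case, one (imperfect) tripwire is to enumerate the loaded images and flag anything that isn't a system library or part of your own bundle. This is a sketch, not a guarantee; the path prefixes below are assumptions, and an attacker with enough control can also subvert the check itself:

```c
/* Sketch: scan loaded Mach-O images for anything unexpected, as a
 * heuristic against DYLD_INSERT_LIBRARIES-style injection on OS X. */
#include <mach-o/dyld.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int count_suspicious_images(void) {
    int suspicious = 0;
    uint32_t n = _dyld_image_count();
    for (uint32_t i = 0; i < n; i++) {
        const char *name = _dyld_get_image_name(i);
        /* Whitelist system locations and the app's own bundle;
         * adjust these prefixes for your deployment. */
        if (strncmp(name, "/usr/lib/", 9) != 0 &&
            strncmp(name, "/System/", 8) != 0 &&
            strstr(name, ".app/") == NULL) {
            fprintf(stderr, "unexpected image: %s\n", name);
            suspicious++;
        }
    }
    return suspicious;
}
```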
[EDIT] Since iOS 8, iOS also allows dynamic frameworks. But as that's still sandboxed (you can only load code-signed frameworks that are inside your application bundle, and can't load frameworks that come from outside your app bundle), injection is still not possible*.
*except if the user jailbreaks their phone, but that means they chose to get rid of all the protections on purpose and thus put their phone at risk; we can't crack our phone's security and still expect it to provide all the protections those securities provided.
This is an answer specific to UNIX-like operating systems; I apologize if it doesn't make sense for your question, but I don't know your platform well. Simply put: don't create a dynamically linked executable.
There are two ways I can think of to do this. Method #2 is probably best for you. They're both similar.
Important for both: the executable must be statically compiled using -static at build time.
Method 1 - static exe, manual load shared libraries by their trusted full paths
Manually dlopen each library you need via a full path, then get the function addresses via dlsym at runtime and assign them to function pointers in order to use them. You'll need to do this for every external function you want to use. I believe functions that are not reentrant-safe won't like this, so for those that use static variables you'll need to use the reentrant-safe versions; these end with "_r", i.e. use strtok_r instead of strtok.
This will be difficult or simple depending on what your app does and how many functions you're using.
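Here is a minimal sketch of method 1; the libm path is illustrative (adjust it for your distribution), and older glibc needs -ldl at link time:

```c
/* Sketch: resolve a function from a trusted absolute path at runtime,
 * instead of letting the dynamic linker pick the library. */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Full, trusted path instead of relying on linker search order. */
    void *h = dlopen("/usr/lib/x86_64-linux-gnu/libm.so.6",
                     RTLD_NOW | RTLD_LOCAL);
    if (!h) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    /* Assign the resolved address to a function pointer and call it. */
    double (*my_cos)(double) = (double (*)(double))dlsym(h, "cos");
    if (my_cos)
        printf("cos(0) = %f\n", my_cos(0.0));
    dlclose(h);
    return 0;
}
```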
Method 2 - Statically link the executable, period
You can solve your subversion problem by just linking a static executable, avoiding dynamic libraries entirely. This will generate a much larger exe than the dlopen()/dlsym() method. Build using the -static compile flag, and instead of using, for example, gcc bah.c -o bah -lssl, use gcc -static bah.c -o bah /usr/lib/libssl.a to link against the statically compiled versions of the libraries you need instead of the dynamic shared libraries. In other words, use -static and don't use -l while building.
For either method:
Once built, use file bah to confirm the executable is statically linked, or confirm by running ldd on it.
Note you'll need statically compiled versions of all the libraries you're linking against present on your system (these files end with .a instead of .so).
Also note upgrading system libraries will not update your executable. If there's a new security bug in OpenSSL, you'll need to get the latest libssl.a and recompile it. If you use the dlopen()/dlsym() method you won't have this problem but you will have portability issues if symbols change in different versions
Each method has its pros and cons based on your needs.
Taking the method 1 dlopen/dlsym approach makes your code more "obfuscated" and smaller, but sacrifices portability in most cases, so it probably isn't what you want. The upside is that it can benefit when security bugs are fixed system-wide.

Is it possible to write a libPOSIX for Windows (Win32) without requiring a background service or DLL that's always loaded?

I know about Cygwin, and I know of its shortcomings. I also know about the slowness of fork, but not why on Earth it's not possible to work around that. I also know Cygwin requires a DLL. I also understand POSIX defines a whole environment (shell, etc...), that's not really what I care about here.
My question is asking if there is another way to tackle the problem. I see more and more POSIX functionality being implemented by the MinGW projects, but there's no complete solution providing full-blown POSIX functionality (comparable to the Linux/Mac/BSD implementation status).
The question really boils down to:
Can the Win32 API (as of MSVC20??) be efficiently used to provide a complete POSIX layer over the Windows API?
Perhaps this will turn out to be a full libc that only taps into the OS library for low-level things like filesystem access, threads, and process control. But I don't know exactly what else POSIX consists of. I doubt a library can turn Win32 into a POSIX-compliant entity.
POSIX <> Win32.
If you're trying to write apps that target POSIX, why are you not using some variant of *N*X? If you prefer to run Windows, you can run Linux/BSD/whatever inside Hyper-V/VMWare/Parallels/VirtualBox on your PC/laptop/etc.
Windows used to have a POSIX-compliant environment that ran alongside the Win32 subsystem, but it was discontinued after NT4 due to lack of demand. Microsoft later bought Interix and released Services For Unix (SFU). While it's still available for download, SFU 3.5 is now deprecated and no longer developed or supported.
As to why fork is so slow, you need to understand that fork isn't just "create a new process"; it's "create a new process (itself an expensive operation) which is a duplicate of the calling process along with all its memory".
In *N*X, the forked process is mapped onto the same memory pages as the parent (i.e. it is pretty quick) and is only given new pages as and when it tries to modify a shared page. This is known as copy-on-write. It is largely achievable because in UNIX there is no hard barrier between the parent and forked processes.
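A minimal sketch of that copy-on-write behaviour (variable names are illustrative): a write in the child gives the child its own private copy of the page, so the parent's value is untouched.

```c
/* Sketch: fork() shares pages until someone writes to them. */
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int value = 42;                 /* on a shared page after fork() */
    pid_t pid = fork();
    if (pid == 0) {                 /* child */
        value = 99;                 /* triggers a private copy of the page */
        printf("child:  value = %d\n", value);
        _exit(0);
    }
    wait(NULL);
    printf("parent: value = %d\n", value);   /* still 42 */
    return 0;
}
```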
In NT, on the other hand, all processes are separated by a barrier enforced by CPU hardware. In NT, the easiest way to spawn a parallel activity which has access to your process' memory and resources, is to create a thread. Threads run within the memory space of the creating process and have access to all of the process' memory and resources.
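For contrast, a hedged Win32 sketch of that cheap NT alternative: a thread created with CreateThread shares the creating process's memory directly, so there is nothing to copy.

```c
/* Sketch: an NT thread runs inside the creator's address space. */
#include <windows.h>
#include <stdio.h>

static int shared_value = 42;

static DWORD WINAPI worker(LPVOID arg) {
    (void)arg;
    shared_value = 99;              /* directly visible to the creator */
    return 0;
}

int main(void) {
    HANDLE t = CreateThread(NULL, 0, worker, NULL, 0, NULL);
    if (!t)
        return 1;
    WaitForSingleObject(t, INFINITE);
    CloseHandle(t);
    printf("shared_value = %d\n", shared_value);   /* prints 99 */
    return 0;
}
```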
You can also share data between processes via various forms of IPC, RPC, Named Pipes, mailslots, memory-mapped files but each technique has its own complexities, performance characteristics, etc. Read this for more details.
Because it tries to mimic UNIX, CygWin's 'fork' operation creates a new child process (in its own isolated memory space) and has to duplicate every page of memory in the parent process within the newly forked child. This can be a very costly operation.
Again, if you want to write POSIX code, do so in *N*X, not NT.
How about this
Most of the Unix API is implemented by the POSIX.DLL dynamically loaded (shared) library. Programs linked with POSIX.DLL run under the Win32 subsystem instead of the POSIX subsystem, so programs can freely intermix Unix and Win32 library calls.
From http://en.wikipedia.org/wiki/UWIN
The UWIN environment may be what you're looking for, but note that it is hosted at research.att.com; while UWIN is distributed under a liberal license, it is not the GNU license. Also, as it is research for AT&T, and only secondarily something that they distribute for use, there are a lot of issues with the documentation.
For more info, see my write-up in the last answer to "Regarding 'for' loop in KornShell".
Hmm, the main UWIN link in that post is broken; try
http://www2.research.att.com/sw/download/
Also, you can look at
https://mailman.research.att.com/pipermail/uwin-users/
OR
https://mailman.research.att.com/pipermail/uwin-developers/
To get a sense of the features vs issues.
I hope this helps.
The question really boils down to: Can the Win32 API (as of MSVC20??) be efficiently used to provide a complete POSIX layer over the Windows API?
Short answer: No.
"Complete POSIX" means fork(), mmap(), signal() and such, and these are [almost] impossible to implement on NT.
To drive the point home: GNU Hurd has problems with fork() as well, because the Hurd kernel is not POSIX. NT is not POSIX either.
Another difference is persistence:
In POSIX-compliant systems it is possible to create system objects and leave them there. Examples of such objects are named pipes and shared memory objects (shms). You can create a named pipe or a shm, and leave it in the filesystem (or in a special filesystem-like place) where other processes will be able to access it. The downside is that a process might die and fail to clean up after itself, leaving unused objects behind (you know about zombie processes? same thing).
In NT every object is reference-counted, and is destroyed as soon as its last handle is closed. Files are among the few objects that persist.
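A small POSIX sketch of that persistence (the object name "/demo_shm" is made up; link with -lrt on older glibc): the object outlives its creating process until somebody calls shm_unlink().

```c
/* Sketch: a named POSIX shared-memory object persists after exit. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    ftruncate(fd, 4096);            /* give the object a size */
    close(fd);
    /* The object now persists (on Linux, visible under /dev/shm) even
     * after this process exits; another process can shm_open() and
     * mmap() it later. On NT, the nearest equivalent (a named
     * CreateFileMapping object) vanishes when its last handle closes. */
    return 0;
}
```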
Symlinks are a filesystem feature and don't exactly depend on the NT kernel, but the current implementation (in Vista and later) is incapable of creating object-type-agnostic symlinks. That is, a symlink is either a file symlink or a directory symlink, and must link to either a file or a directory. If the target has the wrong type, the symlink won't work. You can give it the right type if the target exists when you create the symlink, but POSIX requires that symlinks may be created without their target existing. I can't imagine a use-case for a symlink that points first to a file, then to a directory, but POSIX says that this should work, and if it doesn't, you're not completely POSIX-compliant. And if your symlinking API/utility can be given an option that specifies the right type when the target doesn't exist, that also breaks POSIX compatibility.
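A hedged Win32 sketch of that typed-symlink constraint (paths are illustrative; note also that creating symlinks traditionally requires the SeCreateSymbolicLinkPrivilege, which ordinary users don't have by default):

```c
/* Sketch: NT symlinks are typed at creation time. */
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Works only if C:\data really is (or will be) a directory; for a
     * file target the SYMBOLIC_LINK_FLAG_DIRECTORY flag must be omitted.
     * POSIX has no such distinction. */
    if (!CreateSymbolicLinkA("C:\\link", "C:\\data",
                             SYMBOLIC_LINK_FLAG_DIRECTORY))
        fprintf(stderr, "CreateSymbolicLink failed: %lu\n", GetLastError());
    return 0;
}
```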
It is possible to replicate some POSIX features to some degree (such as "integer descriptors in a single namespace, referencing any I/O object, and being select()able") without sacrificing [much] performance, but that is still a major undertaking, and the POSIX interface is really restrictive (that is, if you could just add one more argument to that function, it would have been possible to Do The Right Thing... but you can't, unless you want to throw POSIX compliance away).
Your best bet is not to rely on POSIX features that are difficult to port to non-POSIX systems, or to abstract in such a way that lower levels may have separate implementations for different OSes, while upper levels do not care about the details.
