Does exporting everything from a DLL affect performance - windows

To do unit testing I had to export many small, internal classes which were never intended for consumption by the clients of my DLL.
I know that each exported function results in a stub in the executable image, and that the Windows loader has to perform fix-ups on those stubs if the DLL is not loaded at its preferred base address.
Someone suggested building DLL as a static lib, solely for the purposes of unit testing.
I wonder whether it is worth the trouble. I could find no reference to how significant a problem exporting every class from a DLL may be, or whether there is any significant gain in loader performance and memory consumption if I am selective about it.
I think I read somewhere that GCC compiler exports everything by default.
EDIT: since the stated motivation for the question is disputable, let me rephrase it:
Should I go through my DLLs and remove DLLEXPORT on all classes that are not exposed to their clients? Let's say I am working with a bunch of legacy DLLs and I noticed they have a lot of unnecessary exports. Will that improve the speed of loading? Specifically on Windows 7 and 8 using MSVC version 9+.

Does exporting everything from a DLL affect performance?
It probably does; however, the effect is immeasurably small. I made a Python script that creates a test DLL exporting > 50,000 symbols. It consists of 1024 exported classes that each contain 48 functions (16 member functions, 16 virtuals and 16 static functions). The compiler also generates about 4-5 exports for each class that appear to be things like the vtable.
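For reference, a hedged sketch of the kind of class the script generated (the macro and class names here are made up; the real DLL contained 1024 of these):

// Hypothetical shape of one generated class (1024 of these, 16 functions of each kind).
#define TESTDLL_API __declspec(dllexport)   // assumed export macro

class TESTDLL_API Exported0001 {
public:
    void         Member00();    // ... Member01 through Member15
    virtual void Virtual00();   // ... Virtual01 through Virtual15 (hence the vtable exports)
    static void  Static00();    // ... Static01 through Static15
};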
I measured load time of the application using SysInternals ProcMon. The load time on the very ancient, underpowered test machine before linking the DLL was between 15 and 30 ms. Adding the DLL, and one call to each of the ~50,000 exported functions, resulted in no measurable change.
This is not a completely conclusive test, but it is good enough to convince me that the symbol resolution and fix-ups are probably an order of magnitude or more faster than any other limiting factor.
Interestingly, being able to create such an insane DLL with the Microsoft tools required adding the /bigobj compiler flag, and it appears there is also a limit of 64K exported symbols in the PE format of a DLL. Furthermore, the (compile-time) compile and link phases for the DLL and the application each took many minutes and used a lot of memory.
So you will be pushing on all kinds of other limits before you get to loader performance problems.
Let's say I am working with a bunch of legacy DLLs and I noticed they have a lot of unnecessary exports. Will that improve the speed of loading?
Nope.
Should I go through my DLLs and remove DLLEXPORT on all classes that are not exposed to their clients?
It depends.
Not simply because of load performance. If this was so critical to the application, then presumably somebody would be benchmarking the startup and would know exactly where the performance problems were. We shouldn't guess about performance impacts:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -- Knuth
However, there may be other reasons for not exporting those "internal" classes and functions. The point of exporting a class/function is so that client code can use it. It should match the DLL's logical external API. If that wasn't the case for a function or class, then it shouldn't have been exported. If there is a lot of functionality in internal classes that cannot be used, or tested, without going through the DLL's public API, it makes one wonder why that functionality exists. If the intent was to create generic reusable classes, perhaps they should be in a library of their own?
Test Driven Design doesn't mean you have to go around exposing everything publicly. And DLL exports are not necessarily required for even the most invasive white-box unit testing of a class. For example, the unit test fixture could be built monolithically and statically link against (or even directly include the sources of) whatever internal classes are needed.
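As a hedged illustration of that last option (all file and class names here are hypothetical), a white-box test target can compile the internal implementation directly rather than importing it from the DLL:

// test_internal_widget.cpp - hypothetical test translation unit. Because it
// compiles the internal source directly, the class needs no DLLEXPORT at all.
#include "internal_widget.cpp"   // hypothetical internal source under test
#include <cassert>

int main()
{
    InternalWidget w;            // hypothetical internal class
    assert(w.Scale(2) == 4);     // assumed behaviour, purely for illustration
    return 0;
}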
Conversely, a completely excusable explanation for having done it this way may be simply that it was easy and simple to implement. If everything else is essentially equal (modulo style and some theoretical architecture concerns), it is also poor form to needlessly change and disrupt a system that was already done a certain way and is working fine.
So this may be a design that should not be copied or extended, and maybe it is worth cleaning it up whenever maintenance or refactoring opportunities come up.
I think I read somewhere that GCC compiler exports everything by default.
The MinGW LD documentation concurs. Note, though, that it says that if you export anything explicitly with __declspec or a .DEF file, this auto-export behavior is disabled.
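For completeness, a hedged sketch of the usual explicit-export pattern (the MYLIB_API and MYLIB_BUILD names are assumptions); under MinGW, exporting anything this way switches off the export-everything default:

// Export macro: dllexport while building the DLL, dllimport for its clients.
#ifdef MYLIB_BUILD
#  define MYLIB_API __declspec(dllexport)
#else
#  define MYLIB_API __declspec(dllimport)
#endif

class MYLIB_API PublicThing {    // exported: part of the DLL's public API
public:
    void DoWork();
};

class InternalHelper {           // not exported: invisible to DLL clients
public:
    void Prepare();
};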

Related

How do resource files in Windows work, and why use them?

I'm getting frustrated that every tutorial/book uses resource files but never explains why they are used or how they work under the hood. I haven't been able to find any information whatsoever on this. I'm hoping someone could create a public answer for everyone to find later.
Some relevant information may include...
What is the rationale behind resource files?
Are they a feature of Windows or of language compilers?
Why should you not just create a GUI via code only?
Are there situations where only a resource file can be used or only code can be used?
How do resource file entries get converted into actual Windows objects at runtime?
What exactly does the resource compiler do with the entries and what does the compiled format contain?
Is there a difference in loading times between using resource files and code?
What's your recommendation on using resources or code?
Any additional information would be appreciated.
Traditionally in Windows development, resource files are embedded directly in the executable binary. I don't think this particular approach to resources is much used outside Windows, but Java has a cross-platform way of doing the same thing by storing resources in a compressed archive along with the executable bits. You can see the same thing in Android development, where resources are embedded in APK files. It's pretty common (except in Windows binaries) to use XML to specify resources.
Packing resources into an executable has the potential advantage of offering a "single-file solution"; the executable contains the application logic and the supporting resources all bundled into one file. This can be helpful in some situations, but it's pretty rare to find a substantial application that can be distributed as a single file. In any event, this method of packing resources goes back to the earliest days of Windows development; Microsoft probably drew on approaches that were common in other desktop micro platforms of that time, all of which, I guess, are now long dead.
Resources allow certain things -- typically user interface elements -- to be specified declaratively, rather than in code. A nice feature is to allow these declarative elements to be locale-dependent. So, for example, we can specify a main menu for English, one for French, and so on, and have the proper one picked at run-time.
It's easier to have graphical design tools write and edit declarative resource files than code, although it's not impossible to have them output code directly. In any event, it's usually considered "nicer" to have user interface elements separated from application logic. With careful design, the user interface bits and other resources can be edited after the application itself is finalized, and can be maintained separately.
Classical Windows resources are compiled by a resource compiler into a packed binary format, poked in that format into object (.obj, .o) files, and then linked into the executable by the linker. I suspect that with modern development tools, all that stuff is completely hidden, and you just see the final executable. The Windows APIs know how to unpack resources from the executable, and turn them into programmatic representations -- not code, as such, but data in memory that can be used by other API calls.
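As a hedged sketch of that unpacking step (the resource id 101 and the custom type "MYDATA" are assumptions; they would come from the application's .rc script), the raw Win32 API side looks roughly like this:

#include <windows.h>
#include <cstdio>

// Sketch: pull a binary resource back out of the executable at runtime.
int main()
{
    HMODULE self = GetModuleHandleW(nullptr);
    HRSRC   res  = FindResourceW(self, MAKEINTRESOURCEW(101), L"MYDATA");  // assumed id/type
    if (!res) return 1;

    HGLOBAL handle = LoadResource(self, res);   // maps the packed bytes
    const void* data = LockResource(handle);    // pointer into the loaded image
    DWORD size = SizeofResource(self, res);

    std::printf("resource is %lu bytes at %p\n", size, data);
    return 0;
}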
In my experience, Windows (binary) resources don't have any significant overhead over explicit coding, although processing an XML-based resource can be a bit slower.
Coding a user interface usually offers more flexibility than using resources; it's not uncommon to see both approaches used in the same application.

Embed and execute native code from memory

I want to do these two things with my application (Windows only):
Allow the user to insert (with a tool) native code into my application before starting it.
Run this user-inserted code straight from memory during runtime.
Ideally, it would have to be easy for the user to specify this code.
I have two ideas how to do this that I'm considering right now:
The user would embed a native DLL into the application's resources. The application would load this DLL straight from memory using techniques from this article.
Somehow copy the assembly code of the DLL method specified by the user into my application's resources, and execute this code from the heap as described in this article.
Are there any better options to do this? If not, any thoughts on what might cause problems in those solutions?
EDIT
I specifically do not want to use the LoadLibrary* calls, as they require the DLL file to already be on the hard drive, which I'm trying to avoid. I'm also trying to make disassembly harder.
EDIT
Some more details:
Application code is under my control and is native. I simply want to provide the user with a way to embed his own customized functions after my application is compiled and deployed.
User code can have arbitrary restrictions placed on it by me; that is not a problem.
The aim is to allow third parties to statically link code into a native application.
The obvious way to do this is to supply the third parties with the application's object files and a linker. This could be wrapped up in a tool to make it easy to use.
As ever, the devil is in the detail. In addition to object files, applications contain manifests, resources, etc. You need to find a linker that you are entitled to distribute. You need to use a compiler that is compatible with said linker. And so on. But this is certainly feasible, and likely to be more reliable than trying to roll your own solution.
Your option 2 is pretty much intractable in my view. For small amounts of self-contained code it's viable. For any serious amount of code you cannot realistically hope for success without re-inventing the wheel that is your option 1.
For example, real code is going to link to Win32 functions, and how are you going to resolve those? You'd have to invent something just like a PE import table. So why do so, when DLLs already exist? If you invented your own PE-like file format for this code, how would anyone generate it? All the standard tools are in the business of making PE-format DLLs.
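For a sense of what that resolution involves, here is a hedged sketch of wiring up a single Win32 function by hand; an import table effectively does this for every external function the code uses:

#include <windows.h>

// Hedged illustration: without an import table, every external call would
// have to be resolved manually like this, one function at a time.
typedef int (WINAPI *MessageBoxW_t)(HWND, LPCWSTR, LPCWSTR, UINT);

int main()
{
    HMODULE user32 = LoadLibraryW(L"user32.dll");
    if (!user32) return 1;
    auto pMessageBoxW = reinterpret_cast<MessageBoxW_t>(
        GetProcAddress(user32, "MessageBoxW"));
    if (pMessageBoxW)
        pMessageBoxW(nullptr, L"Resolved at runtime", L"Demo", MB_OK);
    FreeLibrary(user32);
    return 0;
}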
As for option 1, loading a DLL from memory is not supported. So you have to do all the work that the loader would do for you if it were loading from a file. So, if you want to load a DLL that is not present on the disk, then option 1 is your only choice.
Any half-competent hacker will readily pull the DLL out of the executing process, though, so don't kid yourself that running DLLs from memory will somehow protect your code from inspection.
This is something called "application virtualization"; there are third-party tools for that, which you can find on Google.
In a simple case, you may just load the "DLL" into memory, apply relocations, set up imports and call the entry point.
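To make that a little more concrete, here is a heavily hedged C++ sketch of the first steps only (header validation and section mapping); relocations, imports, page protections and the DllMain call are only outlined in the final comment, and the function name is illustrative:

#include <windows.h>
#include <cstring>

// Simplified sketch of mapping a DLL image that is already in memory (rawDll).
void* MapDllImage(const BYTE* rawDll)
{
    // 1. Validate the DOS and PE headers.
    auto dos = reinterpret_cast<const IMAGE_DOS_HEADER*>(rawDll);
    if (dos->e_magic != IMAGE_DOS_SIGNATURE) return nullptr;
    auto nt = reinterpret_cast<const IMAGE_NT_HEADERS*>(rawDll + dos->e_lfanew);
    if (nt->Signature != IMAGE_NT_SIGNATURE) return nullptr;

    // 2. Allocate address space for the mapped image.
    BYTE* image = static_cast<BYTE*>(VirtualAlloc(nullptr,
        nt->OptionalHeader.SizeOfImage,
        MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE));
    if (!image) return nullptr;

    // 3. Copy the headers, then copy each section to its virtual address.
    std::memcpy(image, rawDll, nt->OptionalHeader.SizeOfHeaders);
    auto section = IMAGE_FIRST_SECTION(nt);
    for (WORD i = 0; i < nt->FileHeader.NumberOfSections; ++i, ++section)
        std::memcpy(image + section->VirtualAddress,
                    rawDll + section->PointerToRawData,
                    section->SizeOfRawData);

    // 4. Still to do: walk the base relocation directory and patch absolute
    //    addresses, resolve the import table via LoadLibrary/GetProcAddress,
    //    set proper page protections, and finally call
    //    DllMain(image, DLL_PROCESS_ATTACH, nullptr).
    return image;
}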

Substituting a dll, to monitor dll usage

Let's say I have a console application that writes to a file. If I understand correctly, C++ uses some DLL to create and write to the file.
Is it possible to create a DLL with the same name, having the same function signatures, and forward these calls to the real API? The application would not see any change, and it would be possible to be notified of, or restrict, certain calls.
My worry is: is there any security signature that applications check in a DLL?
Would there be any conflicts with the library names?
You don't need to create a new DLL to replace the original, nor should you. That would have global repercussions on the entire OS. What you should do instead is have your app use Detours to hook the particular DLL functions you are interested in. That way, you are not modifying any DLLs at all, and the OS can do its normal work, while still allowing your custom code to run and decide whether or not to call the original DLL functions.
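A hedged sketch of that approach, hooking CreateFileW with Microsoft Detours as an example (the logging is illustrative; the attach pattern follows the Detours samples):

#include <windows.h>
#include <detours.h>   // Microsoft Detours, built and linked separately

// Pointer to the real API; after DetourAttach it still reaches the original code.
static HANDLE (WINAPI *Real_CreateFileW)(LPCWSTR, DWORD, DWORD,
    LPSECURITY_ATTRIBUTES, DWORD, DWORD, HANDLE) = CreateFileW;

static HANDLE WINAPI Hooked_CreateFileW(LPCWSTR name, DWORD access, DWORD share,
    LPSECURITY_ATTRIBUTES sa, DWORD disp, DWORD flags, HANDLE tmpl)
{
    OutputDebugStringW(name);   // observe (or veto) the call here
    return Real_CreateFileW(name, access, share, sa, disp, flags, tmpl);
}

void InstallHook()
{
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourAttach(&(PVOID&)Real_CreateFileW, Hooked_CreateFileW);
    DetourTransactionCommit();
}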
Yes, it's entirely possible: you can figure out what the function signatures are and re-implement them (heh, Google already did this with the Java JRE :) ).
The problem you have is loading a different DLL with the same name, though it's entirely possible to do this explicitly from a fixed directory. You can load the DLL and then hook up all its functions.
At least that's what I think will happen; having two DLLs of the same name in the same process might be troublesome (but I think that with different paths, all is OK).
Security checking generally isn't done when loading DLLs. However, MS does this with some .NET assemblies, but the cost is that it takes a long time to load them, as there's a significant delay caused by the verification required to secure the DLL. This is why a lot of .NET applications (especially those that use DLLs installed in the GAC) are perceived as slow to start: there can be a significant amount of security checking occurring.
I think, generally, if someone has enough access to your computer to install a DLL, he could do a lot worse. A skilled hacker would just replace the original DLL with a new one that does all of the above, and then you wouldn't be able to see a new, rogue DLL lying around your system.
If you are security-conscious and worried about this kind of thing, the correct way to resolve it is with an intrusion-detection system like AIDE. This scans your computer and builds a database of all the files present, with a secure hash of each. You then re-scan at regular intervals and compare the results with the original DB: any changes will be obvious and can be flagged for investigation or ignored as legitimate changes. Many Linux servers do this regularly as part of their security hardening. For more info, go to ServerFault.

Why create DLLs instead of compiling everything into one big executable?

I have seen (and done myself) a lot of small products where the same piece of software is separated into one executable and several DLLs, and those DLLs are not just shared libraries made by somebody else, but libraries written exclusively for this software by the same developer team. (I'm not talking here about big-scale products which just require hundreds of DLLs and share them extensively with other products.)
I understand that separating code into several parts, each one compiling into a separate DLL, is good from the point of view of a developer. It means that:
If a developer changes one project, he has to recompile only this one, and dependent ones, which can be much faster.
A project can be done by a single developer in a team, while other developers will just use provided interfaces, without stepping into the code.
Auto updates of the software may sometimes be faster, with lower server impact.
But what about the end user? Isn't it just bad to deliver a piece of software composed of one EXE & several DLLs, when everything could be grouped together? After all:
The user may not even understand what those files are and why they take up space on his hard disk,
The user may want to move a program, for example to save it on a USB flash drive. Having one big executable makes things easier,
Most anti-virus software will check each DLL. Checking one executable will be much faster than checking a somewhat smaller executable plus dozens of libraries.
Using DLLs makes some things slower (for example, in the .NET Framework, a "good" library must be found and checked to see whether it is signed),
What happens if a DLL is removed or replaced by a bad version? Does every program handle this? Or does it crash without even explaining what's wrong with it?
Having one big executable has some other advantages.
So isn't it better, from the end user's point of view, for small/medium-size programs to be delivered as one big executable? If so, why are there no tools that allow doing this easily (for example, a magic tool integrated into common IDEs which compiles the whole solution into one executable; not every time, of course, but on demand or during deployment)?
This is somewhat similar to putting all CSS or all JavaScript files into one big file for the user. Having several files is much smarter for the developer and easier to maintain, but linking each page of a website to two files instead of dozens optimizes performance. In the same manner, CSS sprites are awful for designers, because they require much more work, but are better from the user's point of view.
It's a tradeoff
(You figured that out yourself already ;))
For most projects, the customer doesn't care about how many files get installed, but he cares how many features are completed in time. Making life easier for developers benefits the user, too.
Some more reasons for DLLs
Some libraries don't play well together in the same build, but can be made to behave in a DLL (e.g. one DLL may use WTL3, the other requires WTL8).
Some of the DLLs may contain components to be loaded into other executables (global hooks, shell extensions, browser addons).
Some of the DLLs might be 3rd party, only available as a DLL.
There may be reuse within the company - even if you see only one "public" product, it might be used in a dozen internal projects using that DLL.
Some of the DLLs might have been built with a different environment that's not available to all developers in the company.
Standalone EXE vs. Installed product
Many products won't work as standalone executable anyway. They require installation, and the user not touching things he's not supposed to touch. Having one or more binaries doesn't matter.
Build Time Impact
Maybe you underestimate the impact of build times, and of maintaining a stable build for large projects. If a build takes even 5 minutes, you could euphemistically call that "making developers think ahead, instead of tinkering until it seems to work OK". But it's a serious time eater, and creates a serious distraction.
Build time of a single project is hard to improve. Working on VC9, build parallelization within one project is shaky, as is the incremental linker. Link times are especially hard to "optimize away" by faster machines.
Developer Independence
Another thing you might underestimate.
To use a DLL, you need a .dll and a .h.
To compile and link source code, you usually need to set up include directories, output directories, install 3rd party libraries, etc. It's a pain, really.
Yes, it is better IMHO - and I always use static linking for exactly the reasons you give, wherever possible. Lots of the reasons that dynamic linkage was invented for (saving memory, for example) no longer really apply. OTOH, there are architectural reasons, for example plugin architectures, why dynamic linking may be preferable to static.
I think your general point about considering carefully the final packaging of deliverables is well made. In the case of JavaScript such packaging is indeed possible, and with compression makes a significant difference.
I've done lots of projects and never met an end user who had any problem with some DLL files residing on his box.
As a developer I would say yes, it could matter. As an end user: who cares...
Yes, it may often be better from the end user's point of view. However, the benefits to the developer (and the development process) that you mention often mean that a business will prefer the cost-effective option.
It's a feature that too few users will appreciate, and that will cost a non-trivial amount to deliver.
Remember that we on StackOverflow are "above average" users. How many (non-geek) family members and friends do you have that would really value the ability to install their software to a USB stick?
The big advantages of DLLs are linked to the boundaries and independence they introduce.
For example, in C/C++ only exported symbols are visible across a DLL boundary. Imagine a module A with a global variable "scale" and a module B with another global variable "scale": if you put everything together you are headed for disaster; in this case a DLL may help you.
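A hedged illustration of that point (the file and module names are made up):

// physics.cpp, linked into Physics.dll
double scale = 1.0;     // this "scale" never leaves Physics.dll

// renderer.cpp, linked into Renderer.dll
double scale = 100.0;   // an unrelated "scale" with the same name

// Linked into one monolithic executable, these two definitions collide
// (duplicate symbol at link time); behind DLL boundaries each stays private.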
You can distribute those DLLs as components to customers who don't use exactly the same compiler/linker options, and this is often a good way to do cross-language interop.

Comparing cold-start to warm start

Our application takes significantly more time to launch after a reboot (cold start) than if it was already opened once (warm start).
Most (if not all) of the difference seems to come from loading DLLs; when the DLLs are in cached memory pages they load much faster. We tried using ClearMem to simulate rebooting (since it's much less time-consuming than actually rebooting) and got mixed results: on some machines it seemed to simulate a reboot very consistently, and on some it did not.
To sum up my questions are:
Have you experienced differences in launch time between cold and warm starts?
How have you dealt with such differences?
Do you know of a way to dependably simulate a reboot?
Edit:
Clarifications for comments:
The application is mostly native C++ with some .NET (the first .NET assembly that's loaded pays for the CLR).
We're looking to improve load time; obviously we did our share of profiling and improved the hotspots in our code.
Something I forgot to mention was that we got some improvement by re-basing all our binaries so the loader doesn't have to do it at load time.
As for simulating reboots, have you considered running your app from a virtual PC? Using virtualization you can conveniently replicate a set of conditions over and over again.
I would also consider some type of profiling app to spot the bit of code causing the time lag, and then making the judgement call about how much of that code is really necessary, or if it could be achieved in a different way.
It would be hard to truly simulate a reboot in software. When you reboot, all devices in your machine get their reset bit asserted, which should cause all memory system-wide to be lost.
In a modern machine you've got memory and caches everywhere: there's the VM subsystem which is storing pages of memory for the program, then you've got the OS caching the contents of files in memory, then you've got the on-disk buffer of sectors on the hard drive itself. You can probably get the OS caches to be reset, but the on-disk buffer on the drive? I don't know of a way.
How did you profile your code? Not all profiling methods are equal and some find hotspots better than others. Are you loading lots of files? If so, disk fragmentation and seek time might come into play.
Maybe even sticking basic timing information into the code, writing out to a log file and examining the files on cold/warm start will help identify where the app is spending time.
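A hedged sketch of that "basic timing information" idea (the log path and the usage example are assumptions): bracket suspect startup phases and append the measured durations to a log, then compare the logs from a cold start and a warm start.

#include <chrono>
#include <cstdio>

// Scoped timer: measures a startup phase and appends the duration to a log.
struct ScopedTimer {
    const char* label;
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    ~ScopedTimer() {
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      std::chrono::steady_clock::now() - start).count();
        if (FILE* log = std::fopen("startup_times.log", "a")) {  // assumed log path
            std::fprintf(log, "%s: %lld ms\n", label, static_cast<long long>(ms));
            std::fclose(log);
        }
    }
};

// Usage during startup, e.g.: { ScopedTimer t{"load plugins"}; LoadPlugins(); }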
Without more information, I would lean towards filesystem/disk cache as the likely difference between the two environments. If that's the case, then you either need to spend less time loading files upfront, or find faster ways to load files.
Example: if you are loading lots of binary data files, speed up loading by combining them into a single file, then slurp the whole file into memory in one read and parse its contents. Fewer disk seeks and less time spent reading off the disk. Again, maybe that doesn't apply.
I don't know offhand of any tools to clear the disk/filesystem cache, but you could write a quick application to read a bunch of unrelated files off of disk to cause the filesystem/disk cache to be loaded with different info.
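A hedged sketch of such a quick cache-evicting application (the file paths are assumptions; point it at any large files that are not part of your app):

#include <cstdio>
#include <string>
#include <vector>

// Read large unrelated files so the OS file cache no longer holds the
// application's own binaries.
int main()
{
    const std::vector<std::string> filler = {
        "C:\\temp\\filler1.bin", "C:\\temp\\filler2.bin"   // assumed filler files
    };
    std::vector<char> buffer(1 << 20);                      // 1 MiB read chunks
    for (const auto& path : filler) {
        if (FILE* f = std::fopen(path.c_str(), "rb")) {
            while (std::fread(buffer.data(), 1, buffer.size(), f) == buffer.size())
                ;                                           // just touch the pages
            std::fclose(f);
        }
    }
    return 0;
}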
@Morten Christiansen said:
One way to make apps start cold-start faster (sort of) is used by e.g. Adobe reader, by loading some of the files on startup, thereby hiding the cold start from the users. This is only usable if the program is not supposed to start up immediately.
That makes the customer pay for initializing our app at every boot even when it isn't used, I really don't like that option (neither does Raymond).
One successful way to speed up application startup is to switch DLLs to delay-load. This is a low-cost change (some fiddling with project settings) but can make startup significantly faster. Afterwards, run depends.exe in profiling mode to figure out which DLLs load during startup anyway, and revert the delay-load on them. Remember that you may also delay-load most Windows DLLs you need.
A very effective technique for improving application cold launch time is optimizing function link ordering.
The Visual Studio linker lets you pass in a file that lists all the functions in the module being linked (or just some of them; it doesn't have to be all of them), and the linker will place those functions next to each other in memory.
When your application is starting up, there are typically calls to init functions throughout your application. Many of these calls will be to a page that isn't in memory yet, resulting in a page fault and a disk seek. That's where slow startup comes from.
Optimizing your application so all these functions are together can be a big win.
Check out Profile Guided Optimization in Visual Studio 2005 or later. One of the things that PGO does for you is function link ordering.
It's a bit difficult to work into a build process, because with PGO you need to link, run your application, and then re-link with the output from the profile run. This means your build process needs to have a runtime environment, deal with cleaning up after bad builds, and all that, but the payoff is typically a 10% or more faster cold launch with no code changes.
There's some more info on PGO here:
http://msdn.microsoft.com/en-us/library/e7k32f4k.aspx
As an alternative to a function order list, just group the code that will be called at startup into the same sections:
#pragma code_seg(".startUp")
// Functions defined between these pragmas are placed together in the
// .startUp section, so the pages touched during initialization are contiguous.
//...
#pragma code_seg
#pragma data_seg(".startUp")
// Likewise for data that is only read while the application starts.
//...
#pragma data_seg
It should be easy to maintain as your code changes, but has the same benefit as the function order list.
I am not sure whether a function order list can specify global variables as well, but using this #pragma data_seg simply works.
One way to make apps start cold-start faster (sort of) is used by e.g. Adobe reader, by loading some of the files on startup, thereby hiding the cold start from the users. This is only usable if the program is not supposed to start up immediately.
Another note, is that .NET 3.5SP1 supposedly has much improved cold-start speed, though how much, I cannot say.
It could be the NICs (LAN cards), and that your app depends on certain other services that require the network to come up. So profiling your application alone may not quite tell you this; you should also examine the dependencies of your application.
If your application is not very complicated, you can just copy all the executables to another directory; that should be similar to a reboot. (Cut and paste doesn't seem to work: Windows is smart enough to know that a file moved to another folder is still cached in memory.)
