How resource files in Windows work and why use them? - windows

I'm getting frustrated at every tutorial/book using resource files but never explaining why they use them or how they work under the hood. I haven't been able to find any information whatsoever on this. I'm hoping someone could create a public answer for everyone to find later.
Some relevant information may include...
What is the rationale behind resource files?
Are they a feature or Windows or language compilers?
Why should you not just create a GUI via code only?
Are there situations where only a resource file can be used or only code can be used?
How do resource file entries get converted in to actual window's objects at runtime?
What exactly does the resource compiler do with the entries and what does the compiled format contain?
Is a difference in loading times created using resource files rather than code?
What's your recommendation on using resources or code?
Any additional information would be appreciated.

Traditionally in Windows development, resource files are embedded directly in the executable binary. I don't think this particular approach to resources is much used outside Windows, but Java has a cross-platform way of doing the same thing by storing resources in a compressed archive along with the executable bits. You can see the same thing in Android development, where resources are embedded in APK files. It's pretty common (except in Windows binaries) to use XML to specify resources.
Packing resources into an executable has the potential advantage of offering a "single-file solution"; the executable contains the application logic and the supporting resources all bundled into one file. This can be helpful in some situations, but it's pretty rare to find a substantial application that can be distributed as a single file. In any event, this method of packing resources goes back to the earliest days of Windows development; Microsoft probably drew on approaches that were common in other desktop micro platforms of that time, all of which, I guess, are now long dead.
Resources allow certain things -- typically user interface elements - to be specified declaratively, rather than in code. A nice feature is to allow these declarative elements to be locale-dependent. So, for example, we can specify a main menu for English, one for French, and so on, and have the proper one picked at run-time.
It's easier to have graphical design tools write and edit declarative resource files than code, although it's not impossible to have them output code directly. In any event, it's usually considered "nicer" to have user interface elements separated from application logic. With careful design, the user interface bits and other resources can be edited after the application itself is finalized, and can be maintained separately.
Classical Windows resources are compiled by a resource compiler into a packed binary format, poked in that format into object (.obj, .o) files, and then linked into the executable by the linker. I suspect that with modern development tools, all that stuff is completely hidden, and you just see the final executable. The Windows APIs know how to unpack resources from the executable, and turn them into programmatic representations -- not code, as such, but data in memory that can be used by other API calls.
In my experience, Windows (binary) resources don't have any significant overhead over explicit coding, although processing an XML-based resource can be a bit slower.
Coding a user interface usually offers more flexibility than using resources; it's not uncommon to see both approaches used in the same application.

Related

Embed and execute native code from memory

I want to do these two things with my application (Windows only):
Allow user to insert (with a tool) a native code into my application before starting it.
Run this user-inserted code straight from memory during runtime.
Ideally, it would have to be easy for user to specify this code.
I have two ideas how to do this that I'm considering right now:
User would embed a native dll into application's resources. Application would load this dll straight from memory using techniques from this article.
Somehow copy assembly code of .dll method specified by user into my application resources, and execute this code from heap as described in this article.
Are there any better options to do this? If not, any thoughts on what might cause problems in those solutions?
EDIT
I specifically do not want to use LoadLibrary* calls as they require dll file to be already on hard drive which I'm trying to avoid. I'm also trying to make dissasembling harder.
EDIT
Some more details:
Application code is under my control and is native. I simply want to provide user with a way to embed his own customized functions after my application is compiled and deployed.
User code can have arbitrary restrictions placed on it by me, it is not a problem.
The aim is to allow third parties to statically link code into a native application.
The obvious way to do this is to supply the third parties with the application's object files and a linker. This could be wrapped up in a tool to make it easy to use.
As ever, the devil is in the detail. In addition to object files, applications contain manifests, resources, etc. You need to find a linker that you are entitled to distribute. You need to use a compiler that is compatible with said linker. And so on. But this is certainly feasible, and likely to be more reliable than trying to roll your own solution.
Your option 2 is pretty much intractable in my view. For small amounts of self-contained code it's viable. For any serious amount of code you cannot realistically hope for success without re-inventing the wheel that is your option 1.
For example, real code is going to link to Win32 functions, and how are you going to resolve those? You'd have to invent something just like a PE import table. So, why do so when DLLs already exist. If you invented your own PE-like file format for this code, how would anyone generate it? All the standard tools are in the business of making PE format DLLs.
As for option 1, loading a DLL from memory is not supported. So you have to do all the work that the loader would do for you if it were loading from file. So, if you want to load a DLL that is not present on the disk, then option 1 is your only choice.
Any half competent hacker will readily pull the DLL from the executing process though so don't kid yourself that running DLLs from memory will somehow protect your code from inspection.
This is something called "application virtualization", there are 3rd party tools for that, check them on google.
In a simple case, you may just load "DLL" into memory, apply relocs, setup imports and call entry point.

Substituting a dll, to monitor dll usage

Let's say i have a console application that writes to a file. If I understand correctly, C++ uses some dll, to create and write to the file.
Is it possible, to create a dll with the same name, having the same function signatures, and forward these calls to the real api? The application would not see any change, and it would be possible to notify, or restrict certain calls.
My worry is - is there any security signature that the applications check in a dll?
Would there be any conflicts with the libary names?
You don't need to create a new DLL to replace the original, nor should you. That would have global repercussions on the entre OS. What you should do instead is have your app use Detours to hook the particular DLL functions you are interested in. That way, you are not modifying any DLLs at all, and the OS can do its normal work, while still allowing your custom code to run and deciding whether to call the original DLL functions or not.
yes, entirely possible you can already figure out what the function signatures are and re-implement them (heh, Google already did this with Java JRE :) )
The problem you have is loading a different dll with the same name, though its entirely possible you can do this explicitly with a fixed directory. you can load the dll and then hook up all its functions.
At least that's what I think will happen - having 2 dlls of the same name in the same process might be troublesome (but I think, different path, all's ok).
Security generally isn't done when loading dlls, however MS does this with some .NET assemblies, but the cost is that it takes a long time to load them as there's a significant delay caused by the decryption required to secure the dll. this is why a lot of .NET applications (especially those that use dlls installed in the GAC) are perceived as slow to start - there can be a significant amount of security checking occurring.
I think, generally, if someone has enough access to your computer to install a dll, he could do a lot worse. A skilled hacker woudl just replace the original dll with a new one that does all of the above - and then you wouldn't be able to see a new, rogue dll lying around your system.
If you are security-conscious and worried about this kind of think, the correct way to resolve it is with an intrusion-detection system like AIDE. This scans your computer and builds a database of all the files present, with a secure hash of each. You then re-scan at regular intervals and compare the results with the original DB: any changes will be obvious and can be flagged for investigation or ignored as legitimate changes. Many Linux servers do this regularly as part of their security hardening. For more info, go to ServerFault.

Why creating DLLs instead of compiling everything to a one big executable?

I saw and done myself a lot of small products where a same piece of software is separated into one executable and several DLLs, and those DLLs are not just shared libraries done by somebody else, but libraries which are done exclusively for this software, by the same developer team. (I'm not talking here about big scale products which just require hundreds of DLLs and share them extensively with other products.)
I understand that separating code into several parts, each one compiling into a separate DLL, is good from the point of view of a developer. It means that:
If a developer changes one project, he has to recompile only this one, and dependent ones, which can be much faster.
A project can be done by a single developer in a team, while other developers will just use provided interfaces, without stepping into the code.
Auto updates of the software may sometimes be faster, with lower server impact.
But what about the end user? Isn't it just bad to deliver a piece of software composed of one EXE & several DLLs, when everything could be grouped together? After all:
The user may not even understand what are those files and why they fill memory on his hard disk,
The user may want to move a program, for example save it on an USB flash drive. Having one big executable makes things easier,
Most anti-virus software will check each DLL. Checking one executable will be much faster than the smaller executable and dozens of libraries.
Using DLLs makes some things slower (for example, in .NET Framework, a "good" library must be found and checked if it is signed),
What happens if a DLL is removed or replaced by a bad version? Does every program handle this? Or does it crash without even explaining what's wrong with it?
Having one big executable has some other advantages.
So isn't it better from end users point of view, for small/medium size programs, to deliver one big executable? If so, why there are no tools allowing to do it easily (for example a magic tool integrated in common IDEs which compiles the whole solution into one executable, not each time, of course, but on-demand or during deployment).
This is someway similar to putting all CSS or all JavaScript files into one big file for the user. Having several files is much smarter for the developer and easier to maintain, but linking each page of a website to two files instead of dozens optimizes performance. In the same manner, CSS sprites are awful for the designers, because they require much more work, but are better from users point of view.
It's a tradeoff
(You figured that out yourself already ;))
For most projects, the customer doesn't care about how many files get installed, but he cares how many features are completed in time. Making life easier for developers benefits the user, too.
Some more reasons for DLL's
Some libraries don't play well together in the same build, but can be made to behave in a DLL (e.g. one DLL may use WTL3, the other requires WTL8).
Some of the DLL's may contain components to be loaded into other executables (global hooks, shell extensions, browser addons).
Some of the DLL's might be 3rd party, only available as DLL.
There may be reuse within the company - even if you see only one "public" product, it might be used in a dozen of internal projects using that DLL.
Some of the DLL's might have been built with a different environment thats not available for all developers in the company.
Standalone EXE vs. Installed product
Many products won't work as standalone executable anyway. They require installation, and the user not touching things he's not supposed to touch. Having one or more binaries doesn't matter.
Build Time Impact
Maybe you underestimate the impact of build times, and maintaining a stable build for large projects. If a build takes even 5 minutes, you could ephemistically call that "make developers think ahead, instead of tinker until it seems to work ok". But it's a serious time eater, and creates a serious distraction.
Build time of a single project is hard to improve. Working on VC9, build parallelization within one project is shaky, as is the incremental linker. Link times are especially hard to "optimize away" by faster machines.
Developer Independence
Another thing you might underestimate.
To use a DLL, you need a .dll and a .h.
To compile and link source code, you usually need to set up include directories, output directories, install 3rd party libraries, etc. It's a pain, really.
Yes, it is better IMHO - and I always use static linking for exactly the reasons you give, wherever possible. Lots of the reasons that dynamic linkage was invented for (saving memory, for example) no longer really apply. OTOH, there are architectural reasons, for example plugin architectures, why dynamic linking may be preferable to static.
I think your general point about considering carefully the final packaging of deliverables is well made. In the case of JavaScript such packaging is indeed possible, and with compression makes a significant difference.
Done lots of projects, never met an end-user which has any problem with some dll files residing on his box.
As a developer I would say yes it could matter. As an end-user who cares...
Yes, it may often be better from the end user's point of view. However, the benefits to the developer (and the development process) that you mention often mean that a business will prefer the cost-effective option.
It's a feature that too few users will appreciate, and that will cost a non-trivial amount to deliver.
Remember that we on StackOverflow are "above average" users. How many (non-geek) family members and friends do you have that would really value the ability to install their software to a USB stick?
The big advantages for dll are linked to the introduction of borders and independance .
For example in C/C++ only symbols exported are visible. Imagine a module A with a global variable "scale" and a module B with another global variable "scale" if you put all together you go to desaster ; in this case a dll may help you.
You can distribute those dll as component for customers without exactly the same compiler / linker options ; and this is often a good way to do cross language interop.

Best way to inject functionality into a binary

What would be the best way of inserting functionality into a binary application (3d party, closed source).
The target application is on OSX and seems to have been compiled using gcc 3+. I can see the listing of functions implemented in the binary and have debugged and isolated one particular function which I would like to remotely call.
Specifically, I would like to call this function - let's call it void zoomByFactor(x,y) - when I receive certain data from a complex HIDevice.
I can easily modify or inject instructions into the binary file itself (ie. the patching does not need to occur only in RAM).
What would you recommend as a way of "nicely" doing this?
Edit:
I do indeed need to entire application. So I can't ditch it and use a library. (For those who need an ethical explanation: this is a proprietary piece of CAD software whose company website hasn't been updated since 2006. I have paid for this product (quite a lot of money for what it is, really) and have project data which I can not easily migrate away from it. The product suits me just fine as it is, but I want to use a new HID which I recently got. I've examined the internals of the application, and I'm fairly confident that I can call the correct function with the relevant data and get it to work properly).
Here's what I've done so far, and it is quite gheto.
I've already modified parts of the application through this process:
xxd -g 0 binary > binary.hex
cat binary.hex | awk 'substitute work' > modified.hex
xxd -r modified.hex > newbinary
chmod 777 newbinary
I'm doing this kind of jumping through hoops because the binary is almost 100 megs large.
The jist of what I'm thinking is that I'd jmp somewhere in the main application loop, launch a thread, and return to the main function.
Now, the questions are: where can I insert the new code? do I need to modify symbol tables? alternatively, how could I make a dylib load automatically so that the only "hacking" I need to do is inserting a call to a normally loaded dylib into the main function?
For those interested in what I've ended up doing, here's a summary:
I've looked at several possibilities. They fall into runtime patching, and static binary file patching.
As far as file patching is concerned, I essentially tried two approaches:
modifying the assembly in the code
segments (__TEXT) of the binary.
modifying the load commands in the
mach header.
The first method requires there to be free space, or methods you can overwrite. It also suffers from extremely poor maintainability. Any new binaries will require hand patching them once again, especially if their source code has even slightly changed.
The second method was to try and add a LC_ LOAD_ DYLIB entry into the mach header. There aren't many mach-o editors out there, so it's hairy, but I actually modified the structures so that my entry was visible by otool -l. However, this didn't actually work as there was a dyld: bad external relocation length at runtime. I'm assuming I need to muck around with import tables etc. And this is way too much effort to get right without an editor.
Second path was to inject code at runtime. There isn't much out there to do this. Even for apps you have control over (ie. a child application you launch). Maybe there's a way to fork() and get the initialization process launched, but I never go that.
There is SIMBL, but this requires your app to be Cocoa because SIMBL will pose as a system wide InputManager and selectively load bundles. I dismissed this because my app was not Cocoa, and besides, I dislike system wide stuff.
Next up was mach_ inject and the mach_star project. There is also a newer project called
PlugSuit hosted at google which seems to be nothing more than a thin wrapper around mach_inject.
Mach_inject provides an API to do what the name implies. I did find a problem in the code though. On 10.5.4, the mmap method in the mach_inject.c file requires there to be a MAP_ SHARED or'd with the MAP_READ or else the mmap will fail.
Aside from that, the whole thing actually works as advertised. I ended up using mach_ inject_ bundle to do what I had intended to do with the static addition of a DYLIB to the mach header: namely launching a new thread on module init that does its dirty business.
Anyways, I've made this a wiki. Feel free to add, correct or update information. There's practically no information available on this kind of work on OSX. The more info, the better.
In MacOS X releases prior to 10.5 you'd do this using an Input Manager extension. Input Manager was intended to handle things like input for non-roman languages, where the extension could popup a window to input the appropriate glyphs and then pass the completed text to the app. The application only needed to make sure it was Unicode-clean, and didn't have to worry about the exact details of every language and region.
Input Manager was wildly abused to patch all sorts of unrelated functionality into applications, and often destabilized the app. It was also becoming an attack vector for trojans, such as "Oompa-Loompa". MacOS 10.5 tightens restrictions on Input Managers: it won't run them in a process owned by root or wheel, nor in a process which has modified its uid. Most significantly, 10.5 won't load an Input Manager into a 64 bit process and has indicated that even 32 bit use is unsupported and will be removed in a future release.
So if you can live with the restrictions, an Input Manager can do what you want. Future MacOS releases will almost certainly introduce another (safer, more limited) way to do this, as the functionality really is needed for language input support.
I believe you could also use the DYLD_INSERT_LIBRARIES method.
This post is also related to what you were trying to do;
I recently took a stab at injection/overriding using the mach_star sources. I ended up writing a tutorial for it since documentation for this stuff is always so sketchy and often out of date.
http://soundly.me/osx-injection-override-tutorial-hello-world/
Interesting problem. If I understand you correctly, you'd like to add the ability to remotely call functions in a running executable.
If you don't really need the whole application, you might be able to strip out the main function and turn it into a library file that you can link against. It'll be up to you to figure out how to make sure all the required initialization occurs.
Another approach could be to act like a virus. Inject a function that handles the remote calls, probably in another thread. You'll need to launch this thread by injecting some code into the main function, or wherever else is appropriate. Most likely you'll run into major issues with initialization, thread safety, and/or maintaining proper program state.
The best option, if its available, is to get the vendor of your application to expose a plugin API that lets you do this cleanly and reliably in a supported manner.
If you go with either hack-the-binary route, it'll be time consuming and brittle, but you'll learn a lot in the process.
On Windows, this is simple to do, is actually very widely done and is known as DLL/code injection.
There is a commercial SDK for OSX which allows doing this: Application Enhancer (free for non-commercial use).

Windows Help files - what are the options?

Back in the old days, Help was not trivial but possible: generate some funky .rtf file with special tags, run it through a compiler, and you got a WinHelp file (.hlp) that actually works really well.
Then, Microsoft decided that WinHelp was not hip and cool anymore and switched to CHM, up to the point they actually axed WinHelp from Vista.
Now, CHM maybe nice, but everyone that tried to open a .chm file on the Network will know the nice "Navigation to the webpage was canceled" screen that is caused by security restrictions.
While there are ways to make CHM work off the network, this is hardly a good choice, because when a user presses the Help Button he wants help and not have to make some funky settings.
Bottom Line: I find CHM absolutely unusable. But with WinHelp not being an option anymore either, I wonder what the alternatives are, especially when it comes to integrate with my Application (i.e. for WinHelp and CHM there are functions that allow you to directly jump to a topic)?
PDF has the disadvantage of requiring the Adobe Reader (or one of the more lightweight ones that not many people use). I could live with that seeing as this is kind of standard nowadays, but can you tell it reliably to jump to a given page/anchor?
HTML files seem to be the best choice, you then just have to deal with different browsers (CSS and stuff).
Edit: I am looking to create my own Help Files. As I am a fan of the "No Setup, Just Extract and Run" Philosophy, i had that problem many times in the past because many of my users will run it off the network, which causes exactly this problem.
So i am looking for a more robust and future-proof way to provide help to my users without having to code a different help system for each application i make.
CHM is a really nice format, but that Security Stuff makes it unusable, as a Help system is supposed to provide help to the user, not to generate even more problems.
HTML would be the next best choice, ONLY IF you would serve them from a public web server. If you tried to bundle it with your app, all the files (and images (and stylesheets (and ...) ) ) would make CHM look like a gift from gods.
That said, when actually bundled in the installation package, (instead of being served over the network), I found the CHM files to work nicely.
OTOH, another pitfall about CHM files: Even if you try to open a CHM file on a local disk, you may bump into the security block if you initially downloaded it from somewhere, because the file could be marked as "came from external source" when it was obtained.
I don't like the html option, and actually moved from plain HTML to CHM by compressing and indexing them. Even use them on a handful of non-Windows customers even.
It simply solved the constant little breakage of people putting it on the network (nesting depth limited, strange locking effects), antivirus that died in directories with 30000 html files, and 20 minutes decompression time while installing on an older system, browser safety zones and features, miscalculations of needed space in the installer etc.
And then I don't even include the people that start "correcting" them, 3rd party product with faulty "integration" attempts etc, complaints about slowliness (browser start-up)
We all had waited years for the problems to go away as OSes and hardware improved, but the problems kept recurring in a bedazzling number of varieties and enough was enough. We found chmlib, and decided we could forever use something based on this as escape with a simple external reader, if the OS provided ones stopped working and switched.
Meanwhile we also have an own compiler, so we are MS free future-proof. That doesn't mean we never will change (solutions with local web-servers seem favourite nowadays), but at least we have a choice.
Our software is both distributed locally to the clients and served from a network share. We opted for generating both a CHM file and a set of HTML files for serving from the network. Users starting the program locally use the CHM file, and users getting their program served from a network share has to use the HTML files.
We use Help and Manual and can thus easily produce both types of output from the same source project. The HTML files also contain searching capabilities and doesn't require a web server, so though it isn't an optimal solution, works fine.
So far all the single-file types for Windows seems broken in one way or another:
WinHelp - obsoleted
HtmlHelp (CHM) - obsoleted on Vista, doesn't work from network share, other than that works really nice
Microsoft Help 2 (HXS) - this seems to work right up until the point when it doesn't, corrupted indexes or similar, this is used by Visual Studio 2005 and above, as an example
If you don't want to use an installer and you don't want the user to perform any extra steps to allow CHM files over the network, why not fall back to WinHelp? Vista does not include WinHlp32.exe out of the box, but it is freely available as a download for both Vista and Server 2008.
It depends on how import the online documentation is to your product, a good documentation infrastructure can be complex to establish but once done it pays off. Here is how we do it -
Help source DITA compilant XML, stored in SCC (ClearCase).
Help editing XMetal
Help compilation, customized Open DITA Toolkit, with custom Perl/Java preprocessing
Help source cross references applications resources at compile time, .RC files etc
Help deliverables from single source, PDF, CHM, Eclipse Help, HTML.
Single source repository produces help for multiple products 10+ with thousands of shared topics.
From what you describe I would look at Eclipse Help, its not simple to integrate into .NET or MFC applications, you basically have to do the help mapping to resolve the request to a URL then fire the URL to Eclipse Help wrapper or a browser.
Is the question how to generate your own help files, or what is the best help file format?
Personally, I find CHM to be excellent. One of the first things I do when setting up a machine is to download the PHP Manual in CHM format (http://www.php.net/download-docs.php) and add a hotkey to it in Crimson Editor. So when I press F1 it loads the CHM and performs a search for the word my cursor is on (great for quick function reference).
If you are doing "just extract and run", you are going to run in security issues. This is especially true if you are users are running Vista (or later). is there a reason why you wanted to avoid packaging your applications inside an installer? Using an installer would alleviate the "external source" problem. You would be able to use .chm files without any problems.
We use InstallAware to create our install packages. It's not cheap, but is very good. If cost is your concern, WIX is open source and pretty robust. WIX does have a learning curve, but it's easy to work with.
PDF has the disadvantage of requiring the Adobe Reader
I use Foxit Reader on Windows at home and at work. A lot smaller and very quick to open. Very handy when you are wondering what exactly a80000326.pdf is and why it is clogging up your documents folder.
I think the solution we're going to end up going with for our application is hosting the help files ourselves. This gives us immediate access to the files and the ability to keep them up to date.
What I plan is to have the content loaded into a huge series of XML files, each one containing help for a specific item. This XML would contain links to other XML files. We would use XSLT to display the contents as necessary.
Depending on the licensing, we may build a client-specific XSLT file in order to tailor the look and feel to what they need. We may need to be able to only show help for particular versions of our product as well and that can be done by filtering out stuff in the XSLT.
I use a commercial package called AuthorIT that can generate a number of different formats, such as chm, html, pdf, word, windows help, xml, xhtml, and some others I have never heard of (does dita ring a bell?).
It is a content management system oriented towards the needs of technical documentation writers.
The advantage is that you can use and re-use the same content to build a set of guides, and then generate them in different formats.
So the bottom line relative to the question of choosing chm or html or whatever is that if you are using this you are not locked into a given format, but you can provide several among which the user can choose, and you can even add more formats as you go along, at no extra cost.
If you just have one guide to create it won't be worth your while, but if you have a documentation set to manage then it is the best to my knowledge. Their support is very helpful also.

Resources