Creating a PE file using IMetaDataEmit::Save(/ToMemory/ToStream) - winapi

I'm writing a native CLR Profiler which does some heavy IL rewriting. When developing a new feature, we sometimes encounter a CLR verification error. For small methods, it's pretty easy to compare the bytes before and after, looking at the various elements (method header, signature, locals, code and exception table, mostly) and finding the error. Sometimes, this can be on huge methods and the process might take a while. I'm trying to dump the current module to a file, in order to easily run peverify.exe (and https://github.com/dotnet/corert/tree/master/src/ILVerify).
I found IMetaDataEmit::Save, which looks perfect on paper (we're using IMetaDataEmit constantly to perform the IL rewriting). I can dump my module, open it in a hex viewer and see the changes I made. However, it only dumps the module (the .Net directory inside the PE). How can I create a full PE (dll/exe) from this module, preferably programmatically?

Related

Embed and execute native code from memory

I want to do these two things with my application (Windows only):
Allow user to insert (with a tool) a native code into my application before starting it.
Run this user-inserted code straight from memory during runtime.
Ideally, it would have to be easy for user to specify this code.
I have two ideas how to do this that I'm considering right now:
User would embed a native dll into application's resources. Application would load this dll straight from memory using techniques from this article.
Somehow copy assembly code of .dll method specified by user into my application resources, and execute this code from heap as described in this article.
Are there any better options to do this? If not, any thoughts on what might cause problems in those solutions?
EDIT
I specifically do not want to use LoadLibrary* calls as they require dll file to be already on hard drive which I'm trying to avoid. I'm also trying to make dissasembling harder.
EDIT
Some more details:
Application code is under my control and is native. I simply want to provide user with a way to embed his own customized functions after my application is compiled and deployed.
User code can have arbitrary restrictions placed on it by me, it is not a problem.
The aim is to allow third parties to statically link code into a native application.
The obvious way to do this is to supply the third parties with the application's object files and a linker. This could be wrapped up in a tool to make it easy to use.
As ever, the devil is in the detail. In addition to object files, applications contain manifests, resources, etc. You need to find a linker that you are entitled to distribute. You need to use a compiler that is compatible with said linker. And so on. But this is certainly feasible, and likely to be more reliable than trying to roll your own solution.
Your option 2 is pretty much intractable in my view. For small amounts of self-contained code it's viable. For any serious amount of code you cannot realistically hope for success without re-inventing the wheel that is your option 1.
For example, real code is going to link to Win32 functions, and how are you going to resolve those? You'd have to invent something just like a PE import table. So, why do so when DLLs already exist. If you invented your own PE-like file format for this code, how would anyone generate it? All the standard tools are in the business of making PE format DLLs.
As for option 1, loading a DLL from memory is not supported. So you have to do all the work that the loader would do for you if it were loading from file. So, if you want to load a DLL that is not present on the disk, then option 1 is your only choice.
Any half competent hacker will readily pull the DLL from the executing process though so don't kid yourself that running DLLs from memory will somehow protect your code from inspection.
This is something called "application virtualization", there are 3rd party tools for that, check them on google.
In a simple case, you may just load "DLL" into memory, apply relocs, setup imports and call entry point.

How can I create an executable .exe PE file manually?

All texts on how to create a compiler stop after explaining lexers and parsers. They don't explain how to create the machine code. I want to understand the end-to-end process.
Currently what I understand is that, the Windows exe file formats are called Portable Executable. I read about the headers it has and am yet to find a resource which explains this easily.
My next issue is, I don't see any resource which explains how machine code is stored in the file. Is it like 32-bit fixed length instructions stored one after another in the .text section?
Is there any place which at least explains how to create an exe file which does nothing (it has a No Op instruction). My next step then would be linking to dll files to print to console.
Nice question! I don't have much expertise on this specific question, but this is how I would start:
PE or ELF does not create pure machine code. It also contains some header info etc. Read more: Writing custom data to executable files in Windows and Linux
I assume you are looking for how does ELF/PE file hold the machine code, you can get that from this question (using objdump): How do you extract only contents of an ELF section
Now, if you want to know how the content part is generated in the first place, i.e. how is the machine code generated, then that's the task of the compiler's code generation.
Try out some resource editor like ResourceEditor to understand the exe or simply ildasm.
PS: These are mostly Unix solutions, but I am sure, PE should be doing something fundamentally similar.
I think the best way to approach it will be first try to analyze how existing PE/ELFs work, basically reverse engineering. And to do that, Unix machine will be a good point to start. And then do your magic :)
Not same but a similar question here.
Update:
I generated an object dump out of a sample c code. Now, I assume that's what you are targeting right? You need to know do you generate this file (a.out)?
https://gist.github.com/1329947
Take a look at this image, a life time of a c code.
Source
Now, just to be clear, you are looking to implement the final step, i.e. conversion of object code to executable code?
As in many of his articles, I'd say Matt Pietrek's piece about PE internals remains the best introdction to the matter more than a decade after being written.
Iv'e used "Wotsit's File Format" for years... all the way back to the days of MS-Dos :-) and back to when it was just a collection of text files you could download from most BBS systems called "The Game programmers file type encyclopaedia"
It's now owned by the people that run Gamedev.Net, and probably one of the best kept secrets on the internet.
You'll find the EXE format on this page : http://www.wotsit.org/list.asp?fc=5
Enjoy.
UPDATE June 2020 - The link above seems to be now dead, I've found the "EXE" page listed on this web archive page of the wotsit site: https://web.archive.org/web/20121019145432/http://www.wotsit.org/list.asp?al=E
UPDATE 2 - I'm keeping the edit as it was when I added the update erlier, thanks to those who wanted to edit it, but it's for a good reason I'm rejecting it:
1) Wotsit.org may at some point in the future come back online, if you actually try visiting the url, you'll find that it's not gone, it does still respond, it just responds with an error message. This tells me that someone is keeping the domain alive for whatever reason.
2) The archive links do seem to be a bit jittery, some work, some don't, sometimes they seem to work, then after a refresh they don't work, then they do work again. I remember from experience when wotsit was still online, they they had some very strange download/linking detection code in, and this probably caused archive.org to get some very wierd results, I do remember them taking this stance because of the huge number of 3rd party sites trying to cash in on their success, by pretending to be affiliate's and then direct linking to wotsit from an ad infested site.
Until the wotsit domain is removed entirely from the internet and not even the DNS responds, then would be the time to wrap everything up into single archive links, until then, this is the best way to maintain the link.
Not surprisingly the best sites for information about writing PE format files are all about creating viruses.
A search of VX Heavens for "PE" gives a whole bunch of tutorials for modifying PE files
Some information about making PE files as small as possible: Tiny PE.
The minimalistic way to mess around with code generation, if you're just looking to try a few simple things out, is to output MS-DOS .COM files, which have no header or metadata. Sadly, you'd be restricted to 16-bit code. This format is still somewhat popular for demos.
As for the instruction format, from what I recall the x86 instruction set is variable-length, including 1-byte instructions. RISC CPUs would probably have fixed-length instructions.
For Linux, one may read and run the examples from
"Programming from the Ground Up" by Jonathan Bartlett:
http://www.cs.princeton.edu/courses/archive/spr08/cos217/reading/ProgrammingGroundUp-1-0-lettersize.pdf
Then of course one may prefer to hack Windows programs. But perhaps the former
gives a better way to understand what really goes on.
Executable file format is dependent on the OS. For windows it is PE32(32 bit) or PE32+(64 bit).
The way the final executable look like depends on the ABI (application binary interface) of the OS. The ABI tells how the OS loader should load the exe and how it should relocate it, whether it is dll or plain executable etc..
Every object file(executable or dll or driver) contains a part called sections. This is where all of our code, data, jump tables etc.. are situated.
Now, to create an object file, which is what a compiler does, you should not just create the executable machine code, but also the headers, symbol table, relocation records, import/export tables etc..
The pure machine code generation part is completely dependent on how much optimized you want your code to be. But to actually run the code in the PC, you must have to create a file with all of the headers and related data(check MSDN for precise PE32+ format) and then put all of the executable machine code(which your compiler generated) into one of the sections(usually code resides in section called .text). If you have created the file conforming to the PE32+ format, then you have now successfully created an executable in windows.

Is creating a memory dump at customer environment good?

I am facing a severe problem with my program, which gets reproduced only in the customer place. Putting logs, are not helping as I doubt the failure is happening in a third party dll. For some reasons, I couldn't get help from the library provider. I am thinking of producing a dump at the point of failure, so that to analyze it offline. Is this a recommended practice? Or any alternatives?
Yes, this is something that every program should have and utilize as often as possible.
I suggest that you don't use third party libraries. Create your own dumps instead. It's very simple and straight forward. You basically need to do the following:
Your program needs to access dbghelp.dll. It's a windows dll that allows you to create human readable call stacks etc. The debugger uses this dll to display data in your process. It also handles post mortem debugging, i.e. dumps of some sort. This dll can safely be distributed with your software. I suggest that you download and install Debugging Tools for Windows. This will give you access to all sorts of tools and the best tool WinDbg.exe and the latest dbghelp.dll is also in that distribution.
In dbghelp.dll you call e.g. MiniDumpWriteDump(), which will create the dump file and that's more or less it. You're done. As soon as you have that file in your hands, you can start using it. Either in the Visual Studio Debugger, which probably even might be associated with the .dmp file extension, or in WinDbg.
Now, there are a few things to think of while you're at it. When checking dump files like this, you need to generate .pdb files when you compile and link your executable. Otherwise there's no chance of mapping the dump data to human readable data, e.g. to get good callstacks and values of variables etc. This also means that you have to save these .pdb files. You need to be able to match them exactly against that very release. Since the dump files are date stamped with the date stamp of the executable, the debugger needs the exact pdb files. It doesn't matter if your code hasn't changed a single bit, if the .pdb files belong to another compilation session, you're toast.
I encourage every windows win32 developer to check out Oleg Starodumov's site DebugInfo.com. It contains a lot of samples and tutorials and how you can configure and tune your dump file generation. There are of course a myriad of ways to exclude certain data, create your custom debug message to attach to the dump etc.
Keep in mind that minidumps will contain very limited information about the application state at exception time. The trade off is a small file (around 50-100 kB depending on your settings). But if you want, you can create a full dump, which will contain the state of the whole application, i.e. globals and even kernel objects. These files can be HUGE and should only be used at extreme cases.
If there are legal aspects, just make sure your customers are aware of what you're doing. I bet you already have some contract where you aren't supposed to reveal business secrets or other legal aspects. If customers complain, convince them how important it is to find bugs and that this will improve the quality of the software drastically. More or less higher quality at the cost of nothing. If it doesn't cost them anything, that's also a good argument :)
Finally, here's another great site if you want to read up more on crash dump analysis: dumpanalysis.org
Hope this helps. Please comment if you want me to explain more.
Cheers !
Edit:
Just wanted to add that MiniDumpWriteDump() requires that you have a pointer to a MINIDUMP-EXCEPTION-INFORMATION (with underscores) struct. But the GetExceptionInformation() macro provides this for you at time of exception in your exception handler (structured exception handling or SEH):
__try {
}
__except (YourHandlerFunction(GetExceptionInformation())) {
}
YourHandlerFunction() will be the one taking care of generating the minidump (or some other function down the call chain). Also, if you have custom errors in your program, e.g. something happens that should not happen but technically is not an exception, you can use RaiseException() to create your own.
GetExceptionInformation() can only be used in this context and nowhere else during program execution.
Crash dumps are a pretty common troubleshooting method and can be very effective, especially for problems that only reproduce at the customer's site.
Just make sure the customer/client understands what you're doing and that you have permission. It's possible that a crash dump can have sensitive information that a customer may not want (or be permitted) to let walk out the door or over the wire.
Better than that there are libraries that will upload crash data back you.
BugDump and BugSplat
And there's the Microsoft way:
http://msdn.microsoft.com/en-us/library/aa936273.aspx
Disclaimer: I am not a lawyer, nor do I pretend to be one, this is not legal advice.
The data you can include in logs and crash dumps also depend on what domain you are working in. For example, medical equipment and patient information systems often contain sensitive data about patients that should not be visible to unauthorized persons.
The HIPAA Privacy Rule regulates
the use and disclosure of certain
information held by "covered entities"
(...) It establishes regulations for
the use and disclosure of Protected
Health Information (PHI). PHI is any
information held by a covered entity
which concerns health status,
provision of health care, or payment
for health care that can be linked to
an individual.[10] This is interpreted
rather broadly and includes any part
of an individual's medical record or
payment history. --Wikipedia
It should not be possible to link health information to an individual. The crash dumps and logs should be anonymized and stripped of any sensitive information, or not sent at all.
Maybe this does not apply to your specific case, so this is more of a general note. I think it applies to other domains that handle sensitive information, such as military and financial, and so on.
Basically the easiest way to produce a dumpfile is by using adplus. You don't need to change your code.
Adplus is part of the debugging tools for windows, as mentioned in the article above.
Adplus is basically a huge vbscript automation of windbg.
What you have to do to use adplus:
Download and install Debugging tools for windows to c:\debuggers
start your application
open a commandline and navigate to c:\debuggers
run this line "adplus -crash your_exe.exe"
reproduce the crash
you'll get a minidump with all the information you need.
you can open the crash dump in your favorite debugger.
within windbg, the command "analyze -v" helped me in at least 40% of all the crashes that only happened at customer site and were not reproducible in house.

Remote debugging in VB6

Is it possible to remotely debug a process started outside VB6?
The application is a VB6 application with quite a few dll/ocx resources. I am attempting to setup a ClickOnce deployment, using Registration-Free COM, of the VB6 app but have been getting errors when it executes.
My understanding of the way that VB6 redirects COM registerations will probably mean that this is not possible but I thought someone might have a better idea.
To support Darryl's answer suggesting Windbg - here's a 2006 blog post by a Microsoft guy about using Windbg with VB6, and 2004 blog post by another Microsoft guy with a brief introduction to Windbg.
EDIT: Just to make it totally clear. Windbg is a free standalone debugger from Microsoft. Compile your VB6 EXEs, DLLs and OCXs into native code with symbols (create PDB files) and you will be able to debug your ClickOnce application.
Key excerpt from the blog:
If you have limited access to the server machine then you can use the
remote debugging facilities of WinDbg. Attach a copy of WinDbg to the
process in the usual way and then turn it in to a debugging server
(check out .server in the WinDbg help). You can then connect to it
remotely from the File menu of WinDbg. It will be just like being
there except for the lack of noise from the server room fans. When
debugging a remote, your copy of WinDbg is just a very smart terminal
so all extensions, symbols and so on have to be on the remote server.
You set this up the exact same way for any DLL, VB6 or .NET.
The symbols for your component will not load until your component does
and so you have to let the server run at least that long. You can put
a break in early in your VB code if you want to stop the debugger at
that point but if you do, remember that it will stop there every time
through the code. Let’s assume that you let it run and then break in.
If you list the loaded symbols for your module with "x MyModule!*"
then you will see all of your functions together with a lot of symbols
bundled in there for you. VB adds interfaces and symbols quite
unashamedly but you don’t generally need to worry about those. One
thing that will probably look strange is that all class/method syntax
with the C++ double colon convention instead of the friendly little
dot. WinDbg doesn’t understand that VB is different and it is treated
just like any DLL with symbols.
From here, you can set breakpoints in the usual way (bp etc) and step
through code. You can also open up VB source code modules and set
breakpoints in them with F9 although the VB file extensions are not in
the source file type dropdown. Stepping through the code is revealing
but might be a little alarming if you have not seen the code that VB
generates for you before. You will be stepping through the assembler
and there is a lot of COM goo in there. Hresults get checked a lot.
You will probably need to refer to the source often to work out where
you are since it takes a bit of practice to be able to know what the
source code looked like. Variants are especially challenging because
VB does a lot of work for you there and what looks like a simple
equation can result in a great deal of code. Optimised code is even
harder because the order of execution is often very different from
what you might expect and it is harder than usual to see the data.
Data is not easy to get at this way. When you look at local variables
(dv is the command) then you may see that variables are simply listed
as eclipsed which means that the memory is being used for something
else as well within the function lifetime or that the name is not
unique in this context. Enums just show as integers or longs and
objects show as pointers. In fact, they always were exactly that but
the VB IDE hides that from you. VB strings are COM BSTRs (and
accordingly Unicode) under the covers and byte arrays are really char
arrays. You might be surprised to discover that VB strings are Unicode
as VB appears to have no support for anything but ANSI. That is
because the Ruby forms engine was ANSI only. The runtime converts the
Unicode strings to ANSI for Ruby and API calls although there are ways
to pass Unicode if you want.
You are not going to be able to get at the Err, App or Printer objects
since you would need to go through a lot of internal and completely
undocumented structures to get at them. Even if you could get there,
they would just be raw data without the accessor functions that you
use in VB. If you need to look at any of those fields, your best bet
is to embed debug code in the source code to copy their values to
somewhere that you can get at.
You can step in to the VB runtime if you want but it probably won’t be
very revealing if you are trying to debug your application. If you do,
you will notice that VB’s internals are very COM influenced. The
influence was actually two way since some COM ideas came from VB
originally.
You may see exceptions when running your code. Null reference
exceptions (i.e dereferencing a null pointer) are not uncommon or
anything to worry about. They will show up as first chance C000005
exceptions with a 0 or almost 0 address. The runtime will sometimes do
that if there are objects set to nothing but that is safe because the
only possible values are null or a valid value. You will also see
exceptions if your code does lookups in collections and the value is
not there. Because exceptions are now so expensive, you probably want
to avoid doing that if you can. Another exception that you will
commonly see is c000008f. If you look the number up then you will find
that it is a floating point inexact result exception. It is used in a
different meaning here – since we don’t generate real floating point
inexact result exceptions, they can safely be thrown to indicate VB
errors of the normal trappable type.
Debugging hangs and crashes in VB components is done very much in the
same way as with any other unmanaged component but it is just a little
harder because of the compilations described above. If you have to try
debugging VB code this way, I would strongly recommend that you start
on a "Hello world" application and work your way up. All the things
that may VB an easy language to code in make it a terrible language to
debug.
I believe that when debugging in VB6, it does not attach to a running binary but instead interprets the code within it's own process. This is why the Task Manager and Win32 APIs show VB6.exe as the running app when debugging.
Also as you say, VB6 sometimes short-circuits calls to COM libraries so intercepting these calls is not always possible.
You're probably going to have to resort to intelligent logging (i.e. log the values of variables around the points where the errors you are getting occur in the hope of locating the line of code it occurs on, and/or the state of relevant variables.)
Good luck
Have you tried windbg? Just make sure you have pdb files for the project.

Best way to inject functionality into a binary

What would be the best way of inserting functionality into a binary application (3d party, closed source).
The target application is on OSX and seems to have been compiled using gcc 3+. I can see the listing of functions implemented in the binary and have debugged and isolated one particular function which I would like to remotely call.
Specifically, I would like to call this function - let's call it void zoomByFactor(x,y) - when I receive certain data from a complex HIDevice.
I can easily modify or inject instructions into the binary file itself (ie. the patching does not need to occur only in RAM).
What would you recommend as a way of "nicely" doing this?
Edit:
I do indeed need to entire application. So I can't ditch it and use a library. (For those who need an ethical explanation: this is a proprietary piece of CAD software whose company website hasn't been updated since 2006. I have paid for this product (quite a lot of money for what it is, really) and have project data which I can not easily migrate away from it. The product suits me just fine as it is, but I want to use a new HID which I recently got. I've examined the internals of the application, and I'm fairly confident that I can call the correct function with the relevant data and get it to work properly).
Here's what I've done so far, and it is quite gheto.
I've already modified parts of the application through this process:
xxd -g 0 binary > binary.hex
cat binary.hex | awk 'substitute work' > modified.hex
xxd -r modified.hex > newbinary
chmod 777 newbinary
I'm doing this kind of jumping through hoops because the binary is almost 100 megs large.
The jist of what I'm thinking is that I'd jmp somewhere in the main application loop, launch a thread, and return to the main function.
Now, the questions are: where can I insert the new code? do I need to modify symbol tables? alternatively, how could I make a dylib load automatically so that the only "hacking" I need to do is inserting a call to a normally loaded dylib into the main function?
For those interested in what I've ended up doing, here's a summary:
I've looked at several possibilities. They fall into runtime patching, and static binary file patching.
As far as file patching is concerned, I essentially tried two approaches:
modifying the assembly in the code
segments (__TEXT) of the binary.
modifying the load commands in the
mach header.
The first method requires there to be free space, or methods you can overwrite. It also suffers from extremely poor maintainability. Any new binaries will require hand patching them once again, especially if their source code has even slightly changed.
The second method was to try and add a LC_ LOAD_ DYLIB entry into the mach header. There aren't many mach-o editors out there, so it's hairy, but I actually modified the structures so that my entry was visible by otool -l. However, this didn't actually work as there was a dyld: bad external relocation length at runtime. I'm assuming I need to muck around with import tables etc. And this is way too much effort to get right without an editor.
Second path was to inject code at runtime. There isn't much out there to do this. Even for apps you have control over (ie. a child application you launch). Maybe there's a way to fork() and get the initialization process launched, but I never go that.
There is SIMBL, but this requires your app to be Cocoa because SIMBL will pose as a system wide InputManager and selectively load bundles. I dismissed this because my app was not Cocoa, and besides, I dislike system wide stuff.
Next up was mach_ inject and the mach_star project. There is also a newer project called
PlugSuit hosted at google which seems to be nothing more than a thin wrapper around mach_inject.
Mach_inject provides an API to do what the name implies. I did find a problem in the code though. On 10.5.4, the mmap method in the mach_inject.c file requires there to be a MAP_ SHARED or'd with the MAP_READ or else the mmap will fail.
Aside from that, the whole thing actually works as advertised. I ended up using mach_ inject_ bundle to do what I had intended to do with the static addition of a DYLIB to the mach header: namely launching a new thread on module init that does its dirty business.
Anyways, I've made this a wiki. Feel free to add, correct or update information. There's practically no information available on this kind of work on OSX. The more info, the better.
In MacOS X releases prior to 10.5 you'd do this using an Input Manager extension. Input Manager was intended to handle things like input for non-roman languages, where the extension could popup a window to input the appropriate glyphs and then pass the completed text to the app. The application only needed to make sure it was Unicode-clean, and didn't have to worry about the exact details of every language and region.
Input Manager was wildly abused to patch all sorts of unrelated functionality into applications, and often destabilized the app. It was also becoming an attack vector for trojans, such as "Oompa-Loompa". MacOS 10.5 tightens restrictions on Input Managers: it won't run them in a process owned by root or wheel, nor in a process which has modified its uid. Most significantly, 10.5 won't load an Input Manager into a 64 bit process and has indicated that even 32 bit use is unsupported and will be removed in a future release.
So if you can live with the restrictions, an Input Manager can do what you want. Future MacOS releases will almost certainly introduce another (safer, more limited) way to do this, as the functionality really is needed for language input support.
I believe you could also use the DYLD_INSERT_LIBRARIES method.
This post is also related to what you were trying to do;
I recently took a stab at injection/overriding using the mach_star sources. I ended up writing a tutorial for it since documentation for this stuff is always so sketchy and often out of date.
http://soundly.me/osx-injection-override-tutorial-hello-world/
Interesting problem. If I understand you correctly, you'd like to add the ability to remotely call functions in a running executable.
If you don't really need the whole application, you might be able to strip out the main function and turn it into a library file that you can link against. It'll be up to you to figure out how to make sure all the required initialization occurs.
Another approach could be to act like a virus. Inject a function that handles the remote calls, probably in another thread. You'll need to launch this thread by injecting some code into the main function, or wherever else is appropriate. Most likely you'll run into major issues with initialization, thread safety, and/or maintaining proper program state.
The best option, if its available, is to get the vendor of your application to expose a plugin API that lets you do this cleanly and reliably in a supported manner.
If you go with either hack-the-binary route, it'll be time consuming and brittle, but you'll learn a lot in the process.
On Windows, this is simple to do, is actually very widely done and is known as DLL/code injection.
There is a commercial SDK for OSX which allows doing this: Application Enhancer (free for non-commercial use).

Resources