Related
All texts on how to create a compiler stop after explaining lexers and parsers. They don't explain how to create the machine code. I want to understand the end-to-end process.
Currently what I understand is that, the Windows exe file formats are called Portable Executable. I read about the headers it has and am yet to find a resource which explains this easily.
My next issue is, I don't see any resource which explains how machine code is stored in the file. Is it like 32-bit fixed length instructions stored one after another in the .text section?
Is there any place which at least explains how to create an exe file which does nothing (it has a No Op instruction). My next step then would be linking to dll files to print to console.
Nice question! I don't have much expertise on this specific question, but this is how I would start:
PE or ELF does not create pure machine code. It also contains some header info etc. Read more: Writing custom data to executable files in Windows and Linux
I assume you are looking for how does ELF/PE file hold the machine code, you can get that from this question (using objdump): How do you extract only contents of an ELF section
Now, if you want to know how the content part is generated in the first place, i.e. how is the machine code generated, then that's the task of the compiler's code generation.
Try out some resource editor like ResourceEditor to understand the exe or simply ildasm.
PS: These are mostly Unix solutions, but I am sure, PE should be doing something fundamentally similar.
I think the best way to approach it will be first try to analyze how existing PE/ELFs work, basically reverse engineering. And to do that, Unix machine will be a good point to start. And then do your magic :)
Not same but a similar question here.
Update:
I generated an object dump out of a sample c code. Now, I assume that's what you are targeting right? You need to know do you generate this file (a.out)?
https://gist.github.com/1329947
Take a look at this image, a life time of a c code.
Source
Now, just to be clear, you are looking to implement the final step, i.e. conversion of object code to executable code?
As in many of his articles, I'd say Matt Pietrek's piece about PE internals remains the best introdction to the matter more than a decade after being written.
Iv'e used "Wotsit's File Format" for years... all the way back to the days of MS-Dos :-) and back to when it was just a collection of text files you could download from most BBS systems called "The Game programmers file type encyclopaedia"
It's now owned by the people that run Gamedev.Net, and probably one of the best kept secrets on the internet.
You'll find the EXE format on this page : http://www.wotsit.org/list.asp?fc=5
Enjoy.
UPDATE June 2020 - The link above seems to be now dead, I've found the "EXE" page listed on this web archive page of the wotsit site: https://web.archive.org/web/20121019145432/http://www.wotsit.org/list.asp?al=E
UPDATE 2 - I'm keeping the edit as it was when I added the update erlier, thanks to those who wanted to edit it, but it's for a good reason I'm rejecting it:
1) Wotsit.org may at some point in the future come back online, if you actually try visiting the url, you'll find that it's not gone, it does still respond, it just responds with an error message. This tells me that someone is keeping the domain alive for whatever reason.
2) The archive links do seem to be a bit jittery, some work, some don't, sometimes they seem to work, then after a refresh they don't work, then they do work again. I remember from experience when wotsit was still online, they they had some very strange download/linking detection code in, and this probably caused archive.org to get some very wierd results, I do remember them taking this stance because of the huge number of 3rd party sites trying to cash in on their success, by pretending to be affiliate's and then direct linking to wotsit from an ad infested site.
Until the wotsit domain is removed entirely from the internet and not even the DNS responds, then would be the time to wrap everything up into single archive links, until then, this is the best way to maintain the link.
Not surprisingly the best sites for information about writing PE format files are all about creating viruses.
A search of VX Heavens for "PE" gives a whole bunch of tutorials for modifying PE files
Some information about making PE files as small as possible: Tiny PE.
The minimalistic way to mess around with code generation, if you're just looking to try a few simple things out, is to output MS-DOS .COM files, which have no header or metadata. Sadly, you'd be restricted to 16-bit code. This format is still somewhat popular for demos.
As for the instruction format, from what I recall the x86 instruction set is variable-length, including 1-byte instructions. RISC CPUs would probably have fixed-length instructions.
For Linux, one may read and run the examples from
"Programming from the Ground Up" by Jonathan Bartlett:
http://www.cs.princeton.edu/courses/archive/spr08/cos217/reading/ProgrammingGroundUp-1-0-lettersize.pdf
Then of course one may prefer to hack Windows programs. But perhaps the former
gives a better way to understand what really goes on.
Executable file format is dependent on the OS. For windows it is PE32(32 bit) or PE32+(64 bit).
The way the final executable look like depends on the ABI (application binary interface) of the OS. The ABI tells how the OS loader should load the exe and how it should relocate it, whether it is dll or plain executable etc..
Every object file(executable or dll or driver) contains a part called sections. This is where all of our code, data, jump tables etc.. are situated.
Now, to create an object file, which is what a compiler does, you should not just create the executable machine code, but also the headers, symbol table, relocation records, import/export tables etc..
The pure machine code generation part is completely dependent on how much optimized you want your code to be. But to actually run the code in the PC, you must have to create a file with all of the headers and related data(check MSDN for precise PE32+ format) and then put all of the executable machine code(which your compiler generated) into one of the sections(usually code resides in section called .text). If you have created the file conforming to the PE32+ format, then you have now successfully created an executable in windows.
OK, the Windows dev platform I have is a Windows XP box and a copy of Visual C++ 6.0. I'm trying to create or modify security descriptors for a service. My initial thought from other answers (and some reading) was that I should use ConvertStringSecurityDescriptorToSecurityDescriptor to setup my security descriptor.
Except...my install of VC++ 6.0 lacks the headers for this function (sddl.h according to MSDN).
Can anyone point me to other APIs for creating/modifying Security Descriptors? I'd be happy if I could walk through an existing one (I can QueryServiceObjectSecurity) and just eliminate certain users, but I can't figure out how to do that just looking at MSDN.
Alternately, if someone could point me in the direction of how to call this function without proper headers, that would be fine.
Obvious answer rebuttal: I can (and will) make an attempt to get IT to install a newer version of VC++ on my system, but the last time I asked IT about anything significant it took 7 weeks for them to respond. Since I'd like to get this done in the next week or two, I think IT is not going to fix this question for me in a timely manner.
In theory, you don't need a newer compiler, just an updated SDK. In reality, VC++ 6 is old enough that it may have trouble parsing the headers for a current SDK though.
As an alternative to that, you could declare pointers to the correct types of functions in your code, then use LoadLibrary and GetProcAddress to get the addresses of the correct functions, then call the functions via those pointers.
As an aside, however, I'd point out that I doubt what you've envisioned will work. I've never tried to do exactly what you're trying, so it's always possible I'm wrong, but every time I've done anything manipulating security descriptors, DACLs, SACLs, or anything similar in Windows, the code's ended up considerably longer and more complex than it initially seemed like it should. Even something extremely trivial generally requires at least a couple hundred lines of code...
You could check out the DCOMPerm sample, it has handles the DACL/ACE and other structures you are going to run into - thats where i started when i created a set of classes to handle this for our COM installations - and as #jerry coffin said it ended up being a lot of code.
You'll have to download the SDK to get the sample.
I am trying to help a client with a problem, but I am running out of ideas. They have a custom, written in house application that runs on a schedule, but it crashes. I don't know how long it has been like this, so I don't think I can trace the crashes back to any particular software updates. The most unfortunate part is there is no longer any source code for the VB6 DLL which contains the meat of the logic.
This VB6 DLL is kicked off by 2-3 function calls from a VB Script. Obviously, I can modify the VB Script to add error logging, but I'm not having much luck getting quality information to pinpoint the source of the crash. I have put logging messages on either side of all of the function calls and determined which of the calls is causing the crash. However, nothing is ever returned in the err object because the call is crashing wscript.exe.
I'm not sure if there is anything else I can do. Any ideas?
Edit: The main reason I care, even though I don't have the source code is that there may be some external factor causing the crash (insufficient credentials, locked file, etc). I have checked the log file that is created in drwtsn32.log as a result of wscript.exe crashing, and the only information I get is an "Access Violation".
I first tend to think this is something to do with security permissions, but couldn't this also be a memory access violation?
You may consider using one of the Sysinternals tools if you truly think this is a problem with the environment such as file permissions. I once used Filemon to figure out all the files my application was touching and discovered a problem that way.
You may also want to do a quick sanity check with Dependency Walker to make sure you are actually loading the DLL files you think you are. I have seen the wrong version of the C runtime being loaded and causing a mysterious crash.
Depending on the scope of the application, your client might want to consider a rewrite. Without source code, they will eventually be forced to do so anyway when something else changes.
It's always possible to use a debugger - either directly on the PC that's running the crashing app or on a memory dump - to determine what's happening to a greater or lesser extent. In this case, where the code is VB6, that may not be very helpful because you'll only get useful information at the Win32 level.
Ultimately, if you don't have the source code then will finding out where the bug is really help? You won't be able to fix it anyway unless you can avoid that code path for ever in the calling script.
You could use the debugging tools for windows. Which might help you pinpoint the error, but without the source to fix it, won't do you much good.
A lazier way would be to call the dll from code (not a script) so you can at least see what is causing the issue and inspect the err object. You still won't be able to fix it, unless the problem is that it is being called incorrectly.
The guy of Coding The Wheel has a pretty interesting series about building an online poker bot which is full of serious technical info, a lot of which is concerned with how to get into existing applications and mess with them, which is, in some way, what you want to do.
Specifically, he has an article on using WinDbg to get at important info, one on how to bend function calls to your own code and one on injecting DLLs in other processes. These techniques might help to find and maybe work around or fix the crash, although I guess it's still a tough call.
There are a couple of tools that may be helpful. First, you can use dependency walker to do a runtime profile of your app:
http://www.dependencywalker.com/
There is a profile menu and you probably want to make sure that the follow child processes option is checked. This will do two things. First, it will allow you to see all of the lib versions that get pulled in. This can be helpful for some problems. Second, the runtime profile uses the debug memory manager when it runs the child processes. So, you will be able to see if buffers are getting overrun and a little bit of information about that.
Another useful tool is process monitor from Mark Russinovich:
http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx
This tool will report all file, registry and thread operations. This will help you determine if any you are bumping into file or registry credential issues.
Process explorer gives you a lot of the same information:
http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
This is also a Russinovich tool. I find that it is a bit easier to look at some data through this tool.
Finally, using debugging tools for windows or dev studio can give you some insight into where the errors are occurring.
Access violation is almost always a memory error - all the more likely in this case because its random crashing (permissions would likely be more obviously reproducible). In the case of a dll it could be either
There's an error in the code in the dll itself - this could be something like a memory allocation error or even a simple loop boundary condition error.
There's an error when the dll tries to link out to another dll on the system. This will generally be caused by a mismatch between dll versions on the machine.
Your first step should be to try and get a reproducible crash condition. If you don't have a set of circumstances that will crash the system then you cannot know when you have fixed it.
I would then install the system on a clean machine and attempt to reproduce the error on that. Run a monitor and check precisely what other files (dlls etc) are open when the program crashes. I have seen code that crashes on a hyperthreaded Pentium but not on an earlier one - so restoring an old machine as a testbed may be a good option to cover that one. Varying the amount of ram in the machine is also worthwhile.
Hopefully these steps might give you a clue. Hopefully it will be an environment problem and so can be avoided by using the right version of windows, dlls etc. However if you're still stuck with the crash at this point with no good clues then your options are either to rewrite or attempt to hunt down the problem further by debugging the dll at assembler lever or dissassembling it. If you are not familiar with assembly code then both of these are long-shots and it's difficult to see what you will gain - and either option is likely to be a massive time-sink. Myself I have in the past, when faced with a particularly low-level high intensity problem like this advertised on one of the 'coder for hire' websites and looked for someone with specialist knowledge. Again you will need a reproducible error to be able to do this.
In the long run a dll without source code will have to be replaced. Paying a specialist with assembly skills to analyse the functions and provide you with flowcharts may well be worthwhile considering. It is good business practice to do this sooner in a controlled manner than later - like after the machine it is running on has crashed and that version of windows is no longer easily available.
You may want to try using Resource Hacker you may have luck de-compiling the in house application. it may not give you the full source code but at least maybe some more info about what the app is doing, which also may help you determine your culrpit.
Add the maximum possible RAM to the machine
This simple and cheap hack has work for me in the past. Of course YMMV.
Reverse engineering is one possibility, although a tough one.
In theory you can decompile and even debug/trace a compiled VB6 application - this is the easy part, modifying it without source, in all but the most simple cases, is the hard part.
Free compilers/decompilers:
VB decompilers
VB debuggers
Rewrite would be, in most cases, a more successful and faster way to solve the problem.
By 'challenging development environment' I don't mean you're on a small boat that's rocking up and down and someone is holding a gun to your head. I mean, are the tools at your disposal making the problem difficult?
Development is typically a cycle of code, run, observe the result, repeat. In some environments this is a very quick and painless process, but in others it's very difficult. We end up using little tricks to help us observe the result and run the code faster.
I was thinking of this because I just started using SSIS (an ETL tool included with SQL Server 2005/8). It's been quite challenging for me to make progress, mainly because there's no guidance on what all the dialogs mean and also because the errors are very cryptic and most of the time don't really tell you what the problem is.
I think the easiest environment I've had to work in was VB6 because there you can edit code while the application is running and it will continue running with your new code! You don't even have to run it again. This can save you a lot of time. Surprisingly, in Netbeans with Java code, you can do the same. It steps out of the method and re-runs the method with the new code.
In SQL Server 2000 when there is an error in a trigger you get no stack trace, which can make it really tricky to locate where the problem occurred since an insert can have a cascading effect and trigger many triggers. In Oracle you get a very nice little stack trace with line numbers so resolving the problem is very easy.
Some of the things that I see really help in locating problems:
Good error messages when things go wrong.
Providing a stack trace when a problem occurs.
Debug environment where you can pause, then output the value of variables and step to follow the execution path.
A graphical debug environment that shows the code as it's running.
A screen that can show the current values of variables so you can print to them.
Ability to turn on debug logging on a production system.
What is the worst you've seen and what can be done to get around these limitations?
EDIT
I really didn't intend for this to be flame bait. I'm really just looking for ideas to improve systems so that if I'm creating something I'll think about these things and not contribute to people's problems. I'm also looking for creative ways around these limitations that I can use if I find myself in this position.
I was working on making modifications to Magento for a client. There is very little information on how the Magento system is organized. There are hundreds of folders and files, and there are at least a thousand view files. There was little support available from Magento forums, and I suspect the main reason for this lack of information is because the creators of Magento want you to pay them to become a certified Magento developer. Also, at that time last year there was no StackOverflow :)
My first task was to figure out how the database schema worked and which table stored some attributes I was looking for. There are over 300 tables in Magento, and I couldn't find out how the SQL queries were being done. So I had just one option...
I exported the entire database (300+ tables, and at least 20,000 lines of SQL code) into a .sql file using PhpMyAdmin, and I 'committed' this file into the subversion repositry. Then, I made some changes to the database using the Magento administration panel, and redownloaded the .sql file. Then, I ran a DIFF using TortioseSvn, and scrolled through the 20k+ lines file to find which lines had changed, LOL. As crazy as it sounds, it did work, and I was able to figure out which tables I needed to access.
My 2nd problem was, because of the crazy directory structure, I had to ftp to about 3 folders at the same time for trivial changes. So I had to keep 3 windows of my ftp program open, switch between them and ftp each time.
The 3rd problem was figuring out how the url mapping worked and where some of the code I wanted was being stored. Here, by sheer luck, I managed to find the Model class I was looking for.
Mostly by sheer luck and other similar crazy adventures I managed to work my way through and complete the project. Since then, StackOverflow was started and by a helpful answer to this bounty question I was able to finally get enough information about Magento that I can do future projects in a less crazy manner (hopefully).
Try keypunching your card deck in Fortran, complete with IBM JCL (Job Control Language), handing it in at the data center window, coming back the next morning and getting an inch-thick stack of printer paper with the hex dump of your crash, and a list of the charges to your account.
Grows hair on your fingernails.
I guess that was an improvement on the prior method of sitting at the console, toggling switches and reading the lights.
Occam on a 400x transputer network. As there was only one transputer that could output to console debugging was a nightmare. Had to build a test harness on a Sun network.
I took a class once, that was loosely based on SICP, except it was taught in Dylan rather than Scheme. Actually, it was in the old Dylan syntax, the prefix one that was based on Scheme. But because there were no interpreters for that old version of Dylan, the professor wrote one. In Java. As an applet. Which meant that it had no access to the filesystem; you had to write all of your code in a separate text editor, and then paste it into the Dylan interpreter. Oh, and it had no debugging facilities, of course. And being a Dylan interpreter written in Java, and this was back in 2000, it was ungodly slow.
Print statement debugging, lots of copying and pasting, and an awful lot of cursing at the interpreter were involved.
Back in the 90's, I was developing applications in Clipper, a compilable dBase-like language. I don't remember if it came with a debugger, we often used a 3rd-party debugger called 'Mr Debug' (really!). Although Clipper was fast, some of our more intensive routines were written in C. If you prayed to the correct gods and uttered the necessary incantations, you could use Microsoft's CodeView debugger to debug the C code. But usually not for more than a few minutes, as the program usually didn't like to spend much time running with CodeView (usually memory problems).
I had a series of makefile switches that I used to stub out the sections of code that I didn't need to debug at the time. My debugging environment was very sparse so there was as much free memory as possible for the program. I also think I drank a lot more back then...
Some years ago I reverse engineered game copy protections. Because the protections was written in C or C++ they were fairly easy to disassemble and understand what was going on. But in some cases it got hairy when the copy protection took a detour into the kernel to obfuscate what was happening. A few of them also started to use of custom made virtual machines to make the problem less understandable. I spent hours writing hooks and debuggers to be able to trace into them. The environment really offered a competetive and innovative mind. I had everything at my disposal save time. Misstakes caused reboots and very little feedback what went wrong. I realized thinking before acting is often a better solution.
Today I dispise debuggers. If the problem is in code visible to me I find it easiest to use verbose logging. (Sometimes the error is not understanding the interface/environment then debuggers are good.) I have also realized time is of an essance. You need to have a good working environment with possibility to test your code instantly. If you compiler takes 15 sec, your environment takes 20 sec to update or your caches takes 5 minutes to clear find another way to test your code. Progress keeps me motivated and without a good working environment I get bored, angry and frustrated.
The last job I had I was a Sitecore Developer. Bugfixing can be very painful if the bug only occurs on the client's system, and they do not have Visual Studio installed on the system, with the remote debugging off, and the problem only happens on the production server (not the staging server).
The worst in recent memory was developing SSRS reports using Dundas controls. We were doing quite a bit with the grids which required coding. The pain was the bugginess of the controls, and the lack of debugging support.
I never got around the limitations, but just worked through them.
A friend of mine downloaded some malware from Facebook, and I'm curious to see what it does without infecting myself. I know that you can't really decompile an .exe, but can I at least view it in Assembly or attach a debugger?
Edit to say it is not a .NET executable, no CLI header.
With a debugger you can step through the program assembly interactively.
With a disassembler, you can view the program assembly in more detail.
With a decompiler, you can turn a program back into partial source code, assuming you know what it was written in (which you can find out with free tools such as PEiD - if the program is packed, you'll have to unpack it first OR Detect-it-Easy if you can't find PEiD anywhere. DIE has a strong developer community on github currently).
Debuggers:
OllyDbg, free, a fine 32-bit debugger, for which you can find numerous user-made plugins and scripts to make it all the more useful.
WinDbg, free, a quite capable debugger by Microsoft. WinDbg is especially useful for looking at the Windows internals, since it knows more about the data structures than other debuggers.
SoftICE, SICE to friends. Commercial and development stopped in 2006. SoftICE is kind of a hardcore tool that runs beneath the operating system (and halts the whole system when invoked). SoftICE is still used by many professionals, although might be hard to obtain and might not work on some hardware (or software - namely, it will not work on Vista or NVIDIA gfx cards).
Disassemblers:
IDA Pro(commercial) - top of the line disassembler/debugger. Used by most professionals, like malware analysts etc. Costs quite a few bucks though (there exists free version, but it is quite quite limited)
W32Dasm(free) - a bit dated but gets the job done. I believe W32Dasm is abandonware these days, and there are numerous user-created hacks to add some very useful functionality. You'll have to look around to find the best version.
Decompilers:
Visual Basic: VB Decompiler, commercial, produces somewhat identifiable bytecode.
Delphi: DeDe, free, produces good quality source code.
C: HexRays, commercial, a plugin for IDA Pro by the same company. Produces great results but costs a big buck, and won't be sold to just anyone (or so I hear).
.NET(C#): dotPeek, free, decompiles .NET 1.0-4.5 assemblies to C#. Support for .dll, .exe, .zip, .vsix, .nupkg, and .winmd files.
Some related tools that might come handy in whatever it is you're doing are resource editors such as ResourceHacker (free) and a good hex editor such as Hex Workshop (commercial).
Additionally, if you are doing malware analysis (or use SICE), I wholeheartedly suggest running everything inside a virtual machine, namely VMware Workstation. In the case of SICE, it will protect your actual system from BSODs, and in the case of malware, it will protect your actual system from the target program. You can read about malware analysis with VMware here.
Personally, I roll with Olly, WinDbg & W32Dasm, and some smaller utility tools.
Also, remember that disassembling or even debugging other people's software is usually against the EULA in the very least :)
psoul's excellent post answers to your question so I won't replicate his good work, but I feel it'd help to explain why this is at once a perfectly valid but also terribly silly question. After all, this is a place to learn, right?
Modern computer programs are produced through a series of conversions, starting with the input of a human-readable body of text instructions (called "source code") and ending with a computer-readable body of instructions (called alternatively "binary" or "machine code").
The way that a computer runs a set of machine code instructions is ultimately very simple. Each action a processor can take (e.g., read from memory, add two values) is represented by a numeric code. If I told you that the number 1 meant scream and the number 2 meant giggle, and then held up cards with either 1 or 2 on them expecting you to scream or giggle accordingly, I would be using what is essentially the same system a computer uses to operate.
A binary file is just a set of those codes (usually call "op codes") and the information ("arguments") that the op codes act on.
Now, assembly language is a computer language where each command word in the language represents exactly one op-code on the processor. There is a direct 1:1 translation between an assembly language command and a processor op-code. This is why coding assembly for an x386 processor is different than coding assembly for an ARM processor.
Disassembly is simply this: a program reads through the binary (the machine code), replacing the op-codes with their equivalent assembly language commands, and outputs the result as a text file. It's important to understand this; if your computer can read the binary, then you can read the binary too, either manually with an op-code table in your hand (ick) or through a disassembler.
Disassemblers have some new tricks and all, but it's important to understand that a disassembler is ultimately a search and replace mechanism. Which is why any EULA which forbids it is ultimately blowing hot air. You can't at once permit the computer reading the program data and also forbid the computer reading the program data.
(Don't get me wrong, there have been attempts to do so. They work as well as DRM on song files.)
However, there are caveats to the disassembly approach. Variable names are non-existent; such a thing doesn't exist to your CPU. Library calls are confusing as hell and often require disassembling further binaries. And assembly is hard as hell to read in the best of conditions.
Most professional programmers can't sit and read assembly language without getting a headache. For an amateur it's just not going to happen.
Anyway, this is a somewhat glossed-over explanation, but I hope it helps. Everyone can feel free to correct any misstatements on my part; it's been a while. ;)
Good news. IDA Pro is actually free for its older versions now:
http://www.hex-rays.com/idapro/idadownfreeware.htm
x64dbg is a good and open source debugger that is actively maintained.
Any decent debugger can do this. Try OllyDbg. (edit: which has a great disassembler that even decodes the parameters to WinAPI calls!)
If you are just trying to figure out what a malware does, it might be much easier to run it under something like the free tool Process Monitor which will report whenever it tries to access the filesystem, registry, ports, etc...
Also, using a virtual machine like the free VMWare server is very helpful for this kind of work. You can make a "clean" image, and then just go back to that every time you run the malware.
I'd say in 2019 (and even more so in 2022), Ghidra (https://ghidra-sre.org/) is worth checking out. It's open source (and free), and has phenomenal code analysis capabilities, including the ability to decompile all the way back to fairly readable C code.
Sure, have a look at IDA Pro. They offer an eval version so you can try it out.
You may get some information viewing it in assembly, but I think the easiest thing to do is fire up a virtual machine and see what it does. Make sure you have no open shares or anything like that that it can jump through though ;)
Boomerang may also be worth checking out.
I can't believe nobody said nothing about Immunity Debugger, yet.
Immunity Debugger is a powerful tool to write exploits, analyze malware, and reverse engineer binary files. It was initially based on Ollydbg 1.0 source code, but with names resoution bug fixed. It has a well supported Python API for easy extensibility, so you can write your python scripts to help you out on the analysis.
Also, there's a good one Peter from Corelan team wrote called mona.py, excelent tool btw.
If you want to run the program to see what it does without infecting your computer, use with a virtual machine like VMWare or Microsoft VPC, or a program that can sandbox the program like SandboxIE
You can use dotPeek, very good for decompile exe file. It is free.
https://www.jetbrains.com/decompiler/
What you want is a type of software called a "Disassembler".
Quick google yields this: Link
If you have no time, submit the malware to cwsandbox:
http://www.cwsandbox.org/
http://jon.oberheide.org/blog/2008/01/15/detecting-and-evading-cwsandbox/
HTH
The explorer suite can do what you want.