As far as I'm aware, Windows hasn't been decompiled by anyone yet. Obviously it's complicated, but surely it should've been done by now to some degree?
My thinking behind this is that if the end-user has access to the software, and the computer is able to run it, then even an obfuscated version of it must be obtainable?
I'm obviously missing something, I'm just not sure what.
There's nothing preventing Windows from being decompiled (apart from EULA and similar legal bindings, of course). As you noted, the code must run on the CPU at some time, and the CPU must read the code from memory, and you can read from memory too. Some parts of the system can be a bit trickier, since to run the OS you need to give the OS some exclusive priviledges (that's how most modern protected OSes work), but it's nothing that can't be worked around. In any case, there's not a lot of effort to prevent Windows decompilation - that would have barely any benefit, while making debugging, error reporting and similar harder. Microsoft even goes so far to provide a special debug version of Windows that's even more tailored for software development.
The main point is that there's little reason to decompile Windows. What practical use would such a massive effort have? And if you're a corporation that needs access to Windows source codes (for example, when developing embedded solutions), you can get them. Just because Windows isn't open source doesn't mean the sources aren't available.
If you're not someone who needs their own version of Windows (common in the times of Windows CE), there's even less of a reason to decompile Windows. You need to stick to the defined public APIs anyway - that's a good practice regardless of whether the software is open source or not. APIs are contracts - implementation details you'd get through decompiling aren't. They might very well change with the next security hotfix or such. This is especially important given how serious Windows is about compatibility - it's quite rare for an update (or even a new major release) to break compatibility with old software.
So, if you want to decompile Windows, there's nothing technical that's really preventing you from doing so. But you're looking at tens of millions of lines of source code that was compiled by very smart compilers, with bits of handwritten optimised assembly thrown around, tons of compatibility workarounds that might as well be outright obfuscation (remember, you don't get the comments - just the actually compiled code). Are you willing to spend a few hundred thousand hours to satisfy your idle curiosity? :P
Related
What is the best way to unpack PE files? I've seen some tools from 7 years ago, like Quick Unpack. Is there anything more recent? Or is it better to run different tools for different packers since individual unpackers are likely more up-to-date?
There is no one size fits all solution.
The latest "all in one" unpackers are Quick Unpack and GUnpacker but they are both getting rather old.
There is Pe-sieve & mal_unpack from hasherezade, mal_unpack can work in an automated fashion. Mal_unpack is basically just an automated version of Pe-sieve. Pe-bear is another tool from hasherezade which helps you re-align sections after doing a dump, which requires that you know how to unpack manually but it make the process much easier.
There is a new online tool for unpacking malware called unpac.me from the OpenAnalysis team which can unpack just about anything.
Successful malware distributors are not using public crypters as often, heuristics for detecting public packers is too easy, which is why we have seen a reduction in AIO unpacker tools. In addition, Themida and VMProtect are the standard now and as they continue to add more features, they're becoming more difficult to unpack everyday. With the new virtualization features, automated unpacking is becoming almost impossible.
Even though Quick Unpack is old, I would not underestimate its power in present days. As long as you find the best setup, this tool will produce launchable dumps of Exes/Dlls packed by dozens of known packers + even unknown ones!
Make sure you are running an old OS (WinXP - 7) really or virtually, and don't forget that Quick Unpack is a dynamic unpacker (i.e. can hardly be used for malware analysis). Old does not mean bad :)
This concerns packers in their classical definition.
If you are looking for some "generic unpacker" for modern protectors (such as VMProtect, Themida, Enigma, Obsidium, etc.), then I do not think they will ever be made. There are some specific tools (both private and public) which can help you to automate "unpacking" partially, but the majority of work still needs to be done by hands to remove these kinds of protectors. But again, it depends on what you want to see in the end (analysable code, de-obfuscated dump, fully launchable reconstruct, ...).
Does Windows 10 support running older Win32 (MFC, ATL, Visual Basic 6) applications on ARM processors? Does it require some form of emulation or conversion?
There's no x86 Win32 emulation at all. You need to use a toolset designed for the platform.
As with 7/8.1 Windows has leaned further and further into the Net way of doing things. So many of the commandline functions are done through net calls.
Also note that Win10 is pretty much Win NT, it is basically what Win98 should have been, to save us the disasterous influx of virus's on what was an OS with a swing door and no form of protection.
That NT side of things will affect all programmers in time, particularly over the following,
The rights of your users. This is a good thing because we have all been frustrated at our users leaving the doors open for virus and hacking. NT at least helps elliminate a lot of that.
File handling. Win10 is a big step closer to an OS on demand (Which is Microsoft's current target), so we can not assume items that our software makes use of will always be locally present, so we must go through the .NET route ready for when ondemand comes in properly so that the OS will handle the demands for us. Though it does worry me that we currently have no real clues as to how that will be handled if the request can not be full filled.
But also we can not be lazy with file access rights. For example we tend to make assumptions in the user's area about access rights, then get bitten in the bum when we do a scan or search of all directories, only to find DirectoryInfo.GetDirectories is unuseable unless we make sure special folders will not stop it part way through.
Since all directories will in time be special folders, we need to be handling the access rights on the work we do now. More easily done in C++ than C# im my opinion.
So, if you have done it in 'Managed' code then it ought to go anywhere that C# and VB go, call my synical if you like, but I can not help but have doubts about that, I can not really see MS finding it desirable to have on-demand applications and OS on NET but also providing Win32 wrapped in MFC running as an alternative. You may find your code is trapped in a shrinking box.
I don't mean for this question to be a flame bait but I'll be using Microsoft and their win32 API as a example of a legacy API.
Now what I am wondering here is Microsoft is spending a lots of their money and energy in maintaining their legacy API, including all of the "glitches/bugs/workaround" that are needed to keep the API functioning the same. Now I'm aware that in Windows 7 they are providing a way for the customer to run their application in a "Windows XP" VM which would be one such way for them to start cleaning up their win32 API because they could then push all of the application into the "Windows XP" VM.
So now what I am wondering is, is it possible to virtualization a legacy API in such way that an customer/program can still access and use it, yet at the same time be able to take advantage of the newer version/API? Because as far as I understand it, if the application is ran in the "Windows XP" VM, it won't be able to access any of the newer API/feature of Windows 7.
The thing that puzzles me about this question when it comes up is that Windows has been doing this since NT came out in the mid nineties. This is how NT runs DOS and Win16 programs, and how it always has. The NTVDM virtualization layer runs 16-bit apps under Win32 with very little special support from the core OS. This is just one example - another is WINE, which as I understand it does a pretty reasonabl job of running windows apps on top of an API set which is very different from that of windows. So it is definitely possible.
The more pertinent question would be why Microsoft would consider it. In order for you to think it is necessary you have to think two things. 1) There is something better to replace the win32 API with and 2) Maintaining the Win32 API is a burden.
Both of these are questionable. In the case of kernel duties, such as accessing hardware and synchronizing and doing threads and processes and memory the Win32 API does a pretty good job, and is ultimately quite close to what the kernel really does. If you think there is a better API then that must mean there is also a better kernel. I personally don't think that NT needs replacing right now. For graphics and windowing, admitedly gdi32 is a bit long in the tooth. But Microsoft solved that problem by building WPF right alongside it. This then brings in the burden question. Well, sure there are two APIs to maintain, but if you virtualized GDI on top of WPF you'd still have to maintain both anyway so there is no benefit there. The advantage of running both in parallel is that GDI already exists and is already tested. All you have to do is to fix the occasional bug, whereas a new virtualization layer would have to be written and tested all over again, which takes time away from making WPF better.
In terms of maintaining back compat, that isn't as much of a burden as it sounds. It is mainly a test question - you have to test that the API behaviour doesn't change, but again - those tests have already been written, so it isn't really any extra work.
So, to answer a question with a question, why would they bother?
This is an interesting question, at least to me, here are some of my thoughts.
Your understanding is correct, an application running in the XP VM only has access to the Win32 APIs provided by XP in the VM. One of the many ways that I have seen Microsoft's approach to enhancing specific APIs is to create new functions with the enhanced/fixed functionality and name the new function by append Ex and even ExEx to the original name, for example
GetVersion
GetVersionEx
For functions that accept pointers to structures, the structures are 'versioned' by using the size of the structure to determine the functionality required, so older code would be passing a previous size of the structure while newer code would be passing in the newer larger strucure and the API functions accordingly.
I guess, the problem has become that it is no longer just differences in how an API works, but more integral to the functioning of the operating system and the internal structures which have changes significantly enough that arguably badly written code is effectively broken.
As to your actual question, I guess it would be quite tough. Even if one thought to let the OS adjust how it executes code based on a target OS version in the PE header of the executable, what would happen if a newer DLL was loaded into the process that targeted the latest OS, now how should the OS handle this when the code is executing? IMHO, I think this would be very challenging, one frought with pitfalls that would ultimately fail.
Of course that is just my raw thoughts on the topic so I might be 100% wrong and there is some simple approach that just did not come to mind.
A friend of mine downloaded some malware from Facebook, and I'm curious to see what it does without infecting myself. I know that you can't really decompile an .exe, but can I at least view it in Assembly or attach a debugger?
Edit to say it is not a .NET executable, no CLI header.
With a debugger you can step through the program assembly interactively.
With a disassembler, you can view the program assembly in more detail.
With a decompiler, you can turn a program back into partial source code, assuming you know what it was written in (which you can find out with free tools such as PEiD - if the program is packed, you'll have to unpack it first OR Detect-it-Easy if you can't find PEiD anywhere. DIE has a strong developer community on github currently).
Debuggers:
OllyDbg, free, a fine 32-bit debugger, for which you can find numerous user-made plugins and scripts to make it all the more useful.
WinDbg, free, a quite capable debugger by Microsoft. WinDbg is especially useful for looking at the Windows internals, since it knows more about the data structures than other debuggers.
SoftICE, SICE to friends. Commercial and development stopped in 2006. SoftICE is kind of a hardcore tool that runs beneath the operating system (and halts the whole system when invoked). SoftICE is still used by many professionals, although might be hard to obtain and might not work on some hardware (or software - namely, it will not work on Vista or NVIDIA gfx cards).
Disassemblers:
IDA Pro(commercial) - top of the line disassembler/debugger. Used by most professionals, like malware analysts etc. Costs quite a few bucks though (there exists free version, but it is quite quite limited)
W32Dasm(free) - a bit dated but gets the job done. I believe W32Dasm is abandonware these days, and there are numerous user-created hacks to add some very useful functionality. You'll have to look around to find the best version.
Decompilers:
Visual Basic: VB Decompiler, commercial, produces somewhat identifiable bytecode.
Delphi: DeDe, free, produces good quality source code.
C: HexRays, commercial, a plugin for IDA Pro by the same company. Produces great results but costs a big buck, and won't be sold to just anyone (or so I hear).
.NET(C#): dotPeek, free, decompiles .NET 1.0-4.5 assemblies to C#. Support for .dll, .exe, .zip, .vsix, .nupkg, and .winmd files.
Some related tools that might come handy in whatever it is you're doing are resource editors such as ResourceHacker (free) and a good hex editor such as Hex Workshop (commercial).
Additionally, if you are doing malware analysis (or use SICE), I wholeheartedly suggest running everything inside a virtual machine, namely VMware Workstation. In the case of SICE, it will protect your actual system from BSODs, and in the case of malware, it will protect your actual system from the target program. You can read about malware analysis with VMware here.
Personally, I roll with Olly, WinDbg & W32Dasm, and some smaller utility tools.
Also, remember that disassembling or even debugging other people's software is usually against the EULA in the very least :)
psoul's excellent post answers to your question so I won't replicate his good work, but I feel it'd help to explain why this is at once a perfectly valid but also terribly silly question. After all, this is a place to learn, right?
Modern computer programs are produced through a series of conversions, starting with the input of a human-readable body of text instructions (called "source code") and ending with a computer-readable body of instructions (called alternatively "binary" or "machine code").
The way that a computer runs a set of machine code instructions is ultimately very simple. Each action a processor can take (e.g., read from memory, add two values) is represented by a numeric code. If I told you that the number 1 meant scream and the number 2 meant giggle, and then held up cards with either 1 or 2 on them expecting you to scream or giggle accordingly, I would be using what is essentially the same system a computer uses to operate.
A binary file is just a set of those codes (usually call "op codes") and the information ("arguments") that the op codes act on.
Now, assembly language is a computer language where each command word in the language represents exactly one op-code on the processor. There is a direct 1:1 translation between an assembly language command and a processor op-code. This is why coding assembly for an x386 processor is different than coding assembly for an ARM processor.
Disassembly is simply this: a program reads through the binary (the machine code), replacing the op-codes with their equivalent assembly language commands, and outputs the result as a text file. It's important to understand this; if your computer can read the binary, then you can read the binary too, either manually with an op-code table in your hand (ick) or through a disassembler.
Disassemblers have some new tricks and all, but it's important to understand that a disassembler is ultimately a search and replace mechanism. Which is why any EULA which forbids it is ultimately blowing hot air. You can't at once permit the computer reading the program data and also forbid the computer reading the program data.
(Don't get me wrong, there have been attempts to do so. They work as well as DRM on song files.)
However, there are caveats to the disassembly approach. Variable names are non-existent; such a thing doesn't exist to your CPU. Library calls are confusing as hell and often require disassembling further binaries. And assembly is hard as hell to read in the best of conditions.
Most professional programmers can't sit and read assembly language without getting a headache. For an amateur it's just not going to happen.
Anyway, this is a somewhat glossed-over explanation, but I hope it helps. Everyone can feel free to correct any misstatements on my part; it's been a while. ;)
Good news. IDA Pro is actually free for its older versions now:
http://www.hex-rays.com/idapro/idadownfreeware.htm
x64dbg is a good and open source debugger that is actively maintained.
Any decent debugger can do this. Try OllyDbg. (edit: which has a great disassembler that even decodes the parameters to WinAPI calls!)
If you are just trying to figure out what a malware does, it might be much easier to run it under something like the free tool Process Monitor which will report whenever it tries to access the filesystem, registry, ports, etc...
Also, using a virtual machine like the free VMWare server is very helpful for this kind of work. You can make a "clean" image, and then just go back to that every time you run the malware.
I'd say in 2019 (and even more so in 2022), Ghidra (https://ghidra-sre.org/) is worth checking out. It's open source (and free), and has phenomenal code analysis capabilities, including the ability to decompile all the way back to fairly readable C code.
Sure, have a look at IDA Pro. They offer an eval version so you can try it out.
You may get some information viewing it in assembly, but I think the easiest thing to do is fire up a virtual machine and see what it does. Make sure you have no open shares or anything like that that it can jump through though ;)
Boomerang may also be worth checking out.
I can't believe nobody said nothing about Immunity Debugger, yet.
Immunity Debugger is a powerful tool to write exploits, analyze malware, and reverse engineer binary files. It was initially based on Ollydbg 1.0 source code, but with names resoution bug fixed. It has a well supported Python API for easy extensibility, so you can write your python scripts to help you out on the analysis.
Also, there's a good one Peter from Corelan team wrote called mona.py, excelent tool btw.
If you want to run the program to see what it does without infecting your computer, use with a virtual machine like VMWare or Microsoft VPC, or a program that can sandbox the program like SandboxIE
You can use dotPeek, very good for decompile exe file. It is free.
https://www.jetbrains.com/decompiler/
What you want is a type of software called a "Disassembler".
Quick google yields this: Link
If you have no time, submit the malware to cwsandbox:
http://www.cwsandbox.org/
http://jon.oberheide.org/blog/2008/01/15/detecting-and-evading-cwsandbox/
HTH
The explorer suite can do what you want.
The Free MS Windows replacement operating system ReactOS has just released a new version. They have a large and active development team.
Have you tried your software with it yet?
if so what is your recommendation?
Is it time to start investigating it as a serious Windows replacement?
Targeting ReactOS specifically is a bit too narrow IMO -- perhaps a better focus is to target compatibility with WINE. Because ReactOS shares so many of its usermode DLLs with WINE, targeting WINE should result in the app running just fine on ReactOS.
Of course, there will always be things that WINE can't emulate well (hence the need for ReactOS). In this way, it seems that if something runs in WINE, it will run in ReactOS, whereas the fact that something runs in ReactOS doesn't mean that it will necessarily run in WINE.
Targeting WINE is well documented, perhaps easier to test, and by definition, should make your app compatible with ReactOS as a matter of course. In this way, you're not only gathering the rather large user base of current WINE users, but you're future-proofing yourself for whenever anyone wants to use your app with ReactOS.
In their homepage, at the Tour you can see a partial list of office, tools and games that already run OK (or more or less) at ReactOS. If you subscribe to the newsletter, you'll receive info about much more - for instance, I was quite surprised when I read most SQL Server 2000 tools actually work on ReactOS!! Query Analyzer, OSQL and Books Online work fine, Enterprise Manager and Profiler are buggy and the DBMS won't work at all.
At a former workplace (an all MS shop) we investigated seriously into it as a way to reduce our expenditure in licenses whilst keeping our in-house developed apps. Since it couldn't run MSDE fine, we had to abandon the project - hope in the future this will be solved and my ex-coworkers can push it again.
These announcements might as well be also on their homepage - I couldn't find them after 5 mins. of searching, though. Probably the easiest way to know all these compatibility issues is to join the newsletter, or look for its archives.
I have been tracking this OS' progress for quite some time. I believe it has all the potential to really bring an OSS operating system to the masses for it breaks the "chicken and egg" problem: it has applications and drivers from the very beginning (since it aims to have full ABI compatibility with MS Windows).
Just wait for their first beta, I won't be surprised if they surpass Linux in popularity really soon after that...
Post Edit: Found it! Look at section Support Database, it's the web place to go look for whether a particular piece of hardware of some program works on ReactOS.
ReactOS has been under development for a long long time.
They were in some hot water earlier because some of their code appeared to be line by line dissasembly of some NT kernel code, I think they have replaced all of it.
I wouldn't bother with cross platform testing until they hit the same market penetration as Linux, which I would wager is never.
Until ReactOS doesn't randomly crash just sitting there within 5 minutes of booting, I won't worry about testing my code on it. Don't get me wrong, I like ReactOS, but it's just not stable enough for any meaningful testing yet!
No, I do not think it is time to start thinking of it as a Windows replacement.
As the site states, it's still in the Alpha stages. More importantly, whos Windows replacement? Yours? Your users? The former is one thing, the latter is categorically a no-go.
As an aside, I'm not really sure who this OS is targetting. It has to be people who rely on Windows software but don't want to pay, because people who simply don't want Windows can use MacOS / Linux, and the support (community or otherwise) for these choices is good.
Moreover, if you use Linux you already have some amounts of Windows software support via Wine.
Back to people who rely on Windows software but don't want to pay. If they are home users they can just simply pirate it, if they are large business users they already have support contracts and trained people etc. It's hard enough for large businesses to be OK to update to new versions of Windows, let alone an open source replacement.
So I suppose that leaves small businesses who don't want to obtain illegal copies of MS software, can't afford the OS licences and rely on software that only runs on Windows and has bad of non-existent Wine compatibility.
It is a useful replacement for Windows when it runs 'your' software without crashing. At the moment it is not a general purpose os as it is too unstable (being only alpha) but people have used ReactOS successfully in anger for specific tasks already. As a windows replacement it has multiple potential uses, sandbox systems, test and development systems, multiple virtual instances, embedded devices, even packaging/bundling legacy apps with their own compatible o/s. Driver and application compatibility, freed from Microsoft's policy of planned obsolescence and regular GUI renewal, what's not to like?