Decompiling a 1990 DOS application - debugging

I have some crucial data written decades ago by an ancient 16bit DOS application.
There are no docs, no source, and no information about the author. Just the 16 bit exe.
I guess it's time for me to learn how to decompile stuff, since it seems the only way to restore file format.
I've tried OllyDbg, it looks really great, but it can't 16 bit.
So, is there a disassembler/debugger capable of working with such executables?
Thanks.
UPD: I know DOSbox, the app runs in it all right. The problem is, I don't need to run it, I need to understand the file format in which it writes data. Or maybe I don't know something about DOSbox and it can run as a debugger/decompiler as well? Or do you mean starting some old 16bit DOS debugger/decompiler in DOSbox? The latter sounds like an idea, but could you please name a decent DOS debugger, then?

disassembling tool:
use IDA Freeware https://www.hex-rays.com/products/ida/support/download_freeware.shtml
you won't find any better tool for reversing - even for old dos programs :)
most other tools are only capable of doing disassembling for 32bit and don't reach in any way the analyze features of IDA - its the gold standard tool of reverse engineering
debugger:
dosbox got its own builtin debugger (reachable through the "debug" command on command line)
but you need to build your own version of dosbox with activated debugger (oder better heavy-debug) see: http://www.vogons.org/viewtopic.php?t=3944
or if you got a ida licence with the sdk there is an dosbox<->ida-debugger plugin available
currently linux only https://github.com/wjp/idados
file format:
do you know what the file contains (what do you want from the file)
very complex information or "just" some lists of values?
maybe its better to start here with an hex-editor (http://mh-nexus.de/de/hxd/) and known result-values to compare
what program uses the data currently (or only the program itself)? maybe its possible to understand how the data is read in this program?
program itself:
how large is the exe?
console program or a big super gfx power app?
real 16bit or 32bit with dos-extender?
a single exe or overlays(dos-dlls)?
can you give access to the executable?
your turn

You're looking for IDA. It's the de-facto disassembler for pretty much anything.
You can get more help on this at https://reverseengineering.stackexchange.com/

You do not necessarily need to disassemble a program in order to figure out the format in which it writes data.
Perhaps you can do differential analysis on it. Change some inputs to the program, have it write the data, and watch how the file changes.
I have some vintage hardware devices here which can dump their NVRAM settings over MIDI in a binary format (in one case, a single SysEx message with a binary blob in it). If I wanted to know what the format is, I'd make small, systematic changes to the settings, and perform dumps, then see what bits in the binary data are changing.
You really are probably best off attacking the data, rather than the program.

Dosbox is probably a thing to try.
You might also look at http://hte.sourceforge.net/ .

Related

How to watch which instructions execute in a macOS or Windows binary?

I'm trying to reverse engineer some functionality from a decently large binary ~31MB. I was wondering what the best way to "watch" instructions of an executable are. Specifically, I want to run the executable, turn on "watch", trigger a feature in the executable, and then be able to see which regions of the executable binary were run. Then with those addresses I'd like to look at the disassembled instructions.
I'm aware that a lot of the instructions will probably be UI code as the GUI seems to be written in QT, but hopefully it will help me narrow down which part of the binary I need to focus on.
I would prefer a tool that works under Windows, but macOS would work also, as I have a version of the binary for both of those systems.
I'm aware of time-reversible debuggers, but I'm unsure if those would be possible to use with such a large binary.

Can I convert a 16-bit .exe program to a 64-bit .exe?

I realize that there will likely be no special converter programs or anything easy like that for such a task, but it imperative that I find some way to get a 16-bit program to run in 64-bit Windows. Due to the large amount of resources that must be dedicated to them, emulators will not be a good solution.
The idea I had for this project was to decompile all the code from a 16-bit program, copy it, and re-compile it into 64-bit code. Is this at all possible using Eclipse or another programming environment?
Basically, I want to make a 16-bit program run in 64-bit Windows without emulators. I realize that it's a tall order, but is it conceivable?
The problem goes beyond translating 16-bit instructions with 64-bit instructions. There is also the ABI (Application Binary Interface) used by the program to communicate with the rest of the system. A 16-bit program likely uses a lot of DOS calls and it's not unlikely it tries to access hardware directly too. There is no way this can be translated automatically. Even if such a solution existed, I highly doubt the result would be more efficient than running in a virtual machine (which actually is very efficient). Further more, programs written for 16-bit environment are often not very scalable, and completely unable to handle amounts of data beyond the capacities of the original target platform.
So I'd say there are really just two realistic solutions: Run it in a virtual machine. Or if that doesn't cut it, write a new application from scratch that does the same thing.
Even if this is a very old question, but I thought I'd write this solution for anyone still looking out there:
Using something like winevdm -which is a very light windows program- you can run 16-bit Windows (Windows 1.x, 2.x, 3.0, 3.1, etc.) on 64-bit Windows apps very easily!

Suggestions on how to write a debug format conversion tool

I'm looking to write a tool that aims to convert debug symbols of one format to another format that's compatible for use under GDB. This seems like a tedious and potentially complex project so I'm not exactly sure how to tackling it.
Intially I'm aiming to convert the Turbo Debug Symbol table(TDS) emitted from borland compilers into something like stabs or dwarf format(seems like dwarf is prefer from my research). But ideally I want to design my tool to be easy enough to extend so it could convert other formats too later on. e.g. codeview4 or maybe even pdb.
My primary motivation for creating this are:
Interoperability. If I can convert a foreign debug format into a form gdb can work with then source-level debugging would be possible on binaries compiled from another compiler other than gcc. This means any frontend debugging interface that uses gdb as a backend will work as well.
No other tools exist. I did a google searching around for similar tools and the closest I've found is tds2dbg. But it doesn't quite do what I'm looking for.
What I have to work with at the moment:
I already have a debug hook API that can understand the TDS debug format. I can use that to help me get at the needed information from the source format I'm converting from.
For the scope of this project, I'm mainly interested in getting this to work under the win32 environment. Other platforms and tools I'm not really concerned about.
The target dwarf debug format I'm converting to. This one I'm really not familiar with at all. I have used gcc ported compilers like MinGW before and debugged them with gdb with the dwarf format. But I don't have any idea how this format is implemented on windows.
The last point is the one I'm concerned about. I'm reading through the dwarf spec documentation but I find I'm having trouble really understanding and comprehending how it works. There's so much detail in there but at the same time it doesn't have any details about how dwarf gets implemented on object files and image files on a platform that doesn't use ELF natively -- namely the PE-COFF format that windows uses. The documentation is also a very dry read, long sentences make it hard to understand and diagrams and illustrations are sparse. I came across an API called libDwarf that should take most of the parsing work out of interpreting dwarf. The problem is I'm still trying to get it to build and I don't know yet how it will work out.
I haven't written any code yet since I don't fully understand what it is I need to build. I have a feeling the biggest hurtle will be figuring out how to work with dwarf due to it's complexity. Googling for information on how dwarf works under windows hasn't turned up anything helpful either. Like for example, there's no information about the 'glue' code that's needed to contain dwarf within a PE executable image file. How are the dwarf sections exactly layed out? Are there any header information for each section? GDB clearly doesn't just take a 'raw' dwarf debug file and use it as is. So what kind of format does gdb expect the debug file to be in for it to be able to work with it?
My question is, how can I start on such a project? More importantly, where can I turn to for help when I inevitably get stuck on a problem?
Affinic Assembler for Windows
Affinic Assembler is an x86/x86-64 assembler for Windows that takes GAS-syntax assembly source with DWARF debug information and generates corresponding CodeView format sections in object file in order to make the linked program debuggable in Visual Studio. This program is good for Cygwin and MinGW users to port Linux code to Windows.
http://www.affinic.com/?page_id=48
You are asking several questions here :-)
I think you are heading in the right direction, using libdwarf.
BUT, have you taken a look at objcopy to see if this tool can do some of the work for you? It probably doesn't support borland, pdb or codeview4, but it might be worth looking into. (Another approach may be to extend objcopy to support the formats you are trying to convert between.)
I have used the dwarf-discuss mailing list sometimes when I have become stuck.
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
As for the questions on dwarf, split them into separate questions and I will do my best to
answer them. :-)

Scripting language that can produce a small, independent, Windows EXE?

I'd like to do some light data processing - a little binary data manipulation followed by conversion to text serialization. The result is written to a file, and processed by an external program (run by my program). The data processing is more than I'd care to consider doing in batch files.
I'd prefer to use a scripting language, but not have to install the language first. The target computers are mostly older Windows boxes, which are disconnected from the network (no updates, such as PowerShell)
I'm not familiar with the various language's tools for creating EXE files. Which ones have solutions that work well and don't produce huge files? (i.e., whole interpreter package plus my script.)
For my money (its free) AutoIt 3 is exactly what your looking for. AutoIt produces relatively (250k is the standard overhead) small stand alone exes. It has a full perl like regex engine so your light data processing should be a breeze (I've written some pretty heavy data processing scripts in it myself). When downloading autoit be sure to get the full version including Scite this makes compile to exe a one click operation.
I know I might get flamed for this, but VB 6 is a viable option. Since XP SP2 (I think, possibly earlier), Windows has come with its runtimes installed. Not sure about vista.
Theres also the Windows Scripting Host that uses VBScript and JScript.
http://en.wikipedia.org/wiki/Windows_Script_Host
Lua is an excellent choice for that kind of stuff. You can integrate it in your executable or use the standalone Lua interpreter to run your scripts.
While waiting for answers I ran across Shoes, which can make Ruby .exe (I'm most familiar with Ruby) I got it mostly working, although the size of 2.4MB was a bit larger than I'd like. However, I found that it would crash when changing application focus.
I switched to a 'regular' terminal script, and found rubyscript2exe, which, after working around a problem with rubygems, seems to work, and creates a ~700kb file.
I did rather like some of the options presented, but it's not worth redeveloping at this point.
Python with py2exe. Depends on what you mean by small though.
Would using PowerShell script be something you've considered. The data processing might be richer there.
Why not knock up a .NET application? There are free editions of the IDE, and the Framework comes with Windows as a standard component (which also includes a C# compiler, as it happens.)

What is the quickest path to writing a lightweight GUI program on Windows?

I want a small (< 30MB) standalone Windows executable (a single file) that creates a window which asks the user for the location of a directory and then launches a different program in that directory.
This executable has to run on XP, Vista, Server 2003, and Server 2008 versions of Windows in 32-bits and 64 bits on x86-64 architecture as well as Itanium chips.
It would be spectacular if we only had to build it once in order to run it on all these platforms, but that is not a requirement. This is for a proprietary system, so GPL code is off-limits.
What is the fastest way to put this together?
These are some things I'm looking into, so if you have info about their viability, I'm all about it:
Perl/Tk using perl2exe to get the binary.
Ruby with wxruby
Learn MFC programming and do it the right way like everybody else.
What about a WSH script? It won't be an exe, right, but to ask for a folder I don't see the need for an exe file, much less a 30Mb one...
A 1Kb script, save it as whatever name you like with vbs extension and run it. This, in case it's not clear, asks you for a folder name and then runs calc.exe from the system32 subdirectory. You can of course do a lot better than this in 2 or 4 Kb.
Set WshShell = WScript.CreateObject("WScript.Shell")
win = InputBox("Please type your Windows folder location.")
If Right(win,1) <> "\" Then
win = win & "\"
End If
WshShell.Run win & "system32\calc.exe"
To add a Folder Browser dialog instead of an InputBox, check this out.
Clear benefits are:
Simplicity (well, VB is ugly, but you can use JScript if you prefer), no need to compile it!
Compatibility, works on every windows machine I have available (from 98 onwards)
I'd use .NET and WinForms. The idea of scripted solution is appealing, but in practice I often find you end up jumping through hoops to do anything beyond the basic case and still don't have the flexibility to do everything you want.
Quickest way on Windows for a lightweight and fast GUI? One word.. Delphi! It lacks the 64 bit support for now but then FreePascal would come to the rescue.
Having a small stand-alone application and developing it quickly are, I'm sorry to say, usually conflicting requirements.
To be honest, given how incredibly simple the application is, I would write it in C with direct Win32 calls: one call to SHBrowseForFolder() to get the directory, and one to ShellExecuteEx() to run the program. Even MFC is far too heavy-weight for such a modest application. Set the C runtime to be statically linked and you should be able to keep the size of the stand-alone executable to less than 100k. A decent Windows C coder should be able to knock that up in less than an hour, assuming you have one to hand.
Python with either wxWidgets or Tkinter should be able to do this with almost no effort at all. Runs on everything, and py2exe will get you a standalone executable.
Tcl/tk is one solution. You can have a single file executable (including custom images, dlls, etc) using something called a "starpack" -- a virtual filesystem that is both tcl interpreter and application code. I think it would weigh in at maybe a couple megabytes.
From your specifications it would take me personally maybe 15 minutes to get a first working version.
Tcl/Tk has a BSD license.
For all of its flaws, Visual Basic has historically been great for super-simple apps like this.
I agree with the Tcl/Tk answer above. For more information about the starpack that he refers to, see: http://www.equi4.com/tclkit/ it's a Tcl/Tk interpreter available for various OS's all in about 1MB. In the past there apparently has been concerned about the look and feel of Tcl/Tk UI's, but this has been addressed by a new framework named "Tile" that supports the native look and feel of the user's OS.
For a quick and dirty GUI program like you said, you can use an AutoIt script. You can even compile to an exe.
For an GUI example of AutoIt, you can check my stdout redirect script in a previous answer here
wxWidgets; it's cross platform, free, open source and easy to learn
You could do this in MFC and have an executable in under 100k. In general, if you want to keep the size of your executables down, you can use UPX to perform exe compression. If you want an example, take a look at uTorrent. It's a full featured BitTorrent app in less than 300k of executable.
I use HTA (HTML Application) for quick-and-dirty form & script applications. See Microsoft's HTA Developers Center for details and examples. This basically uses HTML for the form, and any HTML-accessible scripting language for the script. Normal browser security is bypassed so that you can get at almost all OS internals. The above site also contains links to several tools that nearly automate the scripting part for you.
PyQt works really well for this. Binaries for Windows here:
http://www.riverbankcomputing.co.uk/software/pyqt/download
A good book here:
http://www.amazon.com/Programming-Python-Prentice-Software-Development/dp/0132354187/ref=sr_1_1?ie=UTF8&qid=1295369454&sr=8-1
And you can freeze these using various methods if you need exe(s).
Similar to what Vinko Vrsalovic said, you can use a HTA application. It is as easy as building a webpage with windows scripting host functionality. I have built a few utilities with jscript and it is really easy and quick
http://msdn.microsoft.com/en-us/library/ms536496(VS.85).aspx
These responses are unbelievable.
Visual Studio Forms editor lets you draw out WinForms and autogenerates the boilerplate GUI code (which is a pain in the ass at best for most other languages and toolkits). Then you can use C# or any other .NET language. .NET has stock widgets for file pickers. I could write that script in 20 minutes and it will run on every one of your target platforms for free. Draw out the GUI, drag-n-drop a file picker, fill out maybe two hooks to do the "launch a different file than they wanted" thing, done.
I suggest Autohotkey (AHK) or Autoit. Both support win95+ (with caveats for certain functions). They can be compiled into small .exe without external dependency (besides native DLL's).
Pros:
small size
easy to write code
useful for simple - complex operations
can create GUI easily
Cons:
learning curve (as syntax is unusual)
30MB is pretty huge!
Qt (C++) may be the best choice. It is portable, quick to develop and relatively fast to run. With UPX (Ultimate Packer for eXecutables) your program will be 10M+.
Qt (Python) is OK too, but will be slower.
If you want it less than 1M and/or you want it quick, you can write it in C with win32 api, or use Delphi.

Resources