Anti-debugging - Preventing memory dumps

I am trying to implement some basic anti-debugging functionality in my application. One area I want to focus on in particular is preventing people from easily taking a usable memory dump of my application. I read the article at:
http://www.codeproject.com/KB/security/AntiReverseEngineering.aspx
and that gave me a lot of tips on how to detect whether a debugger is present, as well as some information on how I might prevent memory dumps. But the author notes that one should be careful about using these techniques, such as removing the executable header in memory. He mentions that the OS or other programs may sometimes want to use this information, but I cannot see for what purpose.
Has anyone got some other tips as to how I could stop reverse engineers from dumping my program?
I am on Windows.
Kind regards,
Philip Bennefall

There is no reasonable way to prevent someone from capturing a memory dump of your process. For example, I could attach a kernel debugger to the system, break all execution, and extract your process' dump from the debugger. Therefore, I would focus on making analysis more difficult.
Here are some ideas:
Obfuscate and encrypt your executable code. Decrypt in-memory only, and do not keep decrypted code around for longer than you need it.
Do not store sensitive information in memory for longer than necessary. Use RtlZeroMemory or a similar API to clear out buffers that you are no longer using. This also applies to the stack (local variables and parameters).
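For the second point, here is a minimal sketch of what that might look like, with a hypothetical buffer holding sensitive data. SecureZeroMemory is used instead of a plain memset because the compiler is not allowed to optimize it away:
#include <windows.h>

void UseSecret()
{
    char secret[64];
    // ... fill 'secret' with sensitive data and use it ...

    // Wipe the buffer as soon as it is no longer needed.
    // SecureZeroMemory cannot be elided by the optimizer, unlike
    // a memset on a buffer that is never read again.
    SecureZeroMemory(secret, sizeof(secret));
}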

Related

Make WinDbg or kd attached to the local kernel behave like a system-wide strace

I am running Windows 7, on which I want to do kernel debugging, and I do not want to mess with the boot loader. So I've downloaded LiveKd as suggested here, got it running, and it seems to be working. If I understand correctly, it is some kind of read-only debugging. It is mentioned here that it is very limited and that even breakpoints cannot be used. I would like to ask whether it is possible in this mode to periodically dump all the instructions that are being executed, or basically all events happening on the current OS. I would like to have some kind of system-wide strace (Linux users will know it) and do some statistical analysis on the output. I suppose this depends on several factors, such as having debug symbols installed so addresses can be resolved, etc.
I'm not sure a debugger is the best tool for tracing live system calls. As you've mentioned, a LiveKd session is quite limited and you are not allowed to place breakpoints in it (otherwise you would hang your own system). However, you can still create memory dumps using the .dump command (check the windbg help: .hh .dump). Keep in mind, though, that getting a full dump (/f) of a running system might take a lot of time.
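For reference, writing a dump from a LiveKd session looks like this (the output path is just an example):
kd> .dump /f C:\dumps\livekd-full.dmp
If I remember correctly, omitting /f gives you a small memory dump instead, which is much quicker to write but contains far less information.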
Moving back to the subject of your question: with the "dump approach" you will miss many system calls, as you will only have snapshots of the system at given points in time. So if you are looking for something similar to Linux strace, I would recommend checking out these tools:
Process Monitor (procmon) - a tool which shows you all I/O requests in the system, as well as operations performed on the registry and process activity events
Windows Performance Toolkit - contains tools for collecting (WPR) and analysing (WPA) system and application trace events. There may be a lot of events, and it's really important to filter them according to your needs. ETW (Event Tracing for Windows) is a huge subject, and you will probably need to read some tutorials or books before you can use it effectively (but it's really worth it!).
API Monitor - one of many tracing applications (I consider it one of the best). This tool allows you to trace API calls in any running process. It has a nice interface and even lets you place breakpoints on the methods you'd like to intercept.
There are many other tools which might be used for tracing on Windows, but I would start with the ones I listed above. You may also check a great book on this subject: Inside Windows Debugging. Good luck! :)

How to read some data from a Windows application's memory?

I have an application which displays some data. I need to attach to this app's process, find the data I need in memory (a single number, actually), and save it somewhere. This application doesn't seem to use standard Windows controls, so things aren't going to be as simple as reading control data using AutoIt or something similar.
Currently I'm a self-taught database guy with quite shallow knowledge of Windows application debugging. I'm not even sure I've asked my question correctly.
So, can you give me some starter guidelines - say, what I should read first, and which general directions I should work in?
Thanks.
To read the memory of another application you need to open the process with OpenProcess, requesting at least PROCESS_VM_READ access rights, and then use ReadProcessMemory to read any memory address in the process. If you are an administrator or have the debug privilege, you will be able to open any process with maximal access rights; you only need to enable SeDebugPrivilege first (see for example http://support.microsoft.com/kb/131065).
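A minimal sketch of that sequence (the process ID and address below are hypothetical - you would substitute values found during your own investigation):
#include <windows.h>
#include <stdio.h>

int main()
{
    DWORD pid = 1234;  // hypothetical target process ID
    HANDLE hProcess = OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION,
                                  FALSE, pid);
    if (hProcess == NULL)
    {
        printf("OpenProcess failed: %lu\n", GetLastError());
        return 1;
    }

    unsigned char buffer[256];
    SIZE_T bytesRead = 0;
    LPCVOID address = (LPCVOID)0x00400000;  // hypothetical address to inspect
    if (ReadProcessMemory(hProcess, address, buffer, sizeof(buffer), &bytesRead))
        printf("read %Iu bytes\n", bytesRead);
    else
        printf("ReadProcessMemory failed: %lu\n", GetLastError());

    CloseHandle(hProcess);
    return 0;
}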
If you don't know much about the memory layout of the destination process, you can enumerate the memory blocks with VirtualQueryEx (see "How does one use VirtualAllocEx to make room for a code cave?" as an example where I examine the program code; the program data can be examined in the same way).
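And a sketch of walking the memory blocks with VirtualQueryEx, using the hProcess opened in the snippet above:
MEMORY_BASIC_INFORMATION mbi;
unsigned char* p = NULL;
while (VirtualQueryEx(hProcess, p, &mbi, sizeof(mbi)) == sizeof(mbi))
{
    // Committed read/write regions are where program data usually lives.
    if (mbi.State == MEM_COMMIT && (mbi.Protect & PAGE_READWRITE))
        printf("data block at %p, %Iu bytes\n", mbi.BaseAddress, mbi.RegionSize);
    p = (unsigned char*)mbi.BaseAddress + mbi.RegionSize;
}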
The most practical problem I see is that you've asked your question in too general a way. If you explain more about what kind of data you are looking for, I could probably suggest a better approach. For example, if you can see the data somewhere on screen, you could examine the corresponding windows and controls with Spy++ (part of the Visual Studio Tools). The most important things are the class of the windows (or controls) and the messages sent at the moment the interesting window is displayed. You can also use Process Monitor to trace all file and registry access at the time the window with the interesting information is displayed. At the very least, you should start by examining the memory of the process with ReadProcessMemory at the moment the data you are looking for is displayed in the window.
If you have no success in your investigations, I'd recommend adding more information to your question.
My primary advice is: try to find any other method of integration than this. Even if you succeed, you'll be hostage to any kind of change in the target process, and possibly in the Windows OS. What you are describing is behaviour most virus scanners should flag and hinder: if not now, then in the future.
That said, you can take a look at DLL injection. However, it sounds as if you're going to have to debug the heck out of the target process at the disassembly level: otherwise, how are you going to know what memory address to read?
I used to know the Windows debugging API, but it's a long-lost memory. How about using OllyDbg:
http://www.ollydbg.de/
And controlling it with both an OllyDbg script and AutoIt?
Sounds interesting... but very difficult. Since you say this is a 'one-off', what about something like this instead?
Take a screenshot of this application.
Run the screenshot through an OCR program
If you are able to read the text you are looking for in a predictable way, you're halfway there!
So now, if you can read an OCR'd screenshot of your application, it is a simple matter of writing a program that does the following:
Scripts the steps to get the data on the screen
Creates a screenshot of the data in question
Runs it through an OCR program like Microsoft Office Document Imaging
Extracts the relevant text and does 'whatever' with it.
I have done something like this before with pretty good results, but I would say it is a fragile solution. If the application changes, it stops working. If the OCR can't read the text, it stops working. If the OCR reads the wrong text, it might do worse things than stop working...
As the other posters have said, reaching into memory and pulling out data is a pretty advanced topic... kudos to you if you can figure out a way to do that!
I know this may not be a popular answer, due to the nature of what this software is used for, but programs like Cheat Engine and ArtMoney allow you to search through all the memory reserved by a process for a given value, then refine the results until you find the address of the value you're looking for.
I learned this initially while trying to learn how to better protect my games after coming across a trainer for one of them, but have found the technique occasionally useful when debugging.
Here is an example of the technique described above in use: https://www.youtube.com/watch?v=Nv04gYx2jMw&t=265

Is creating a memory dump at the customer's environment good practice?

I am facing a severe problem with my program which is reproduced only at the customer's site. Adding logs does not help, as I suspect the failure is happening in a third-party DLL. For some reason, I couldn't get help from the library provider. I am thinking of producing a dump at the point of failure so that I can analyze it offline. Is this a recommended practice? Are there any alternatives?
Yes, this is something that every program should have and utilize as often as possible.
I suggest that you don't use third-party libraries for this. Create your own dumps instead. It's very simple and straightforward. You basically need to do the following:
Your program needs access to dbghelp.dll. It's a Windows DLL that allows you to create human-readable call stacks, etc. The debugger uses this DLL to display data in your process, and it also handles post-mortem debugging, i.e. dumps of some sort. This DLL can safely be distributed with your software. I suggest you download and install Debugging Tools for Windows. This will give you access to all sorts of tools, including the best tool, WinDbg.exe; the latest dbghelp.dll is also in that distribution.
In dbghelp.dll you call e.g. MiniDumpWriteDump(), which will create the dump file, and that's more or less it. You're done. As soon as you have that file in your hands, you can start using it, either in the Visual Studio debugger (which might even be associated with the .dmp file extension) or in WinDbg.
Now, there are a few things to think of while you're at it. When checking dump files like this, you need to generate .pdb files when you compile and link your executable; otherwise there's no chance of mapping the dump data to human-readable data, e.g. to get good call stacks and values of variables. This also means you have to save these .pdb files, and you need to be able to match them exactly against that very release. Since the dump files are stamped with the timestamp of the executable, the debugger needs the exact matching .pdb files. It doesn't matter if your code hasn't changed a single bit: if the .pdb files belong to another compilation session, you're toast.
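In WinDbg, pointing the debugger at the matching .pdb files is just a matter of setting the symbol path before (re)loading symbols; the paths below are examples:
0:000> .sympath C:\mysymbols;srv*C:\symcache*https://msdl.microsoft.com/download/symbols
0:000> .reload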
I encourage every Win32 developer to check out Oleg Starodumov's site DebugInfo.com. It contains a lot of samples and tutorials on how you can configure and tune your dump file generation. There are of course a myriad of ways to exclude certain data, create custom debug messages to attach to the dump, etc.
Keep in mind that minidumps contain very limited information about the application state at exception time. The trade-off is a small file (around 50-100 kB, depending on your settings). But if you want, you can create a full dump, which will contain the state of the whole application, i.e. globals and even kernel objects. These files can be HUGE and should only be used in extreme cases.
If there are legal aspects, just make sure your customers are aware of what you're doing. I bet you already have some contract saying that you aren't supposed to reveal business secrets or the like. If customers complain, convince them how important it is to find bugs and that this will improve the quality of the software drastically. More or less, higher quality at no cost to them. If it doesn't cost them anything, that's also a good argument :)
Finally, here's another great site if you want to read up more on crash dump analysis: dumpanalysis.org
Hope this helps. Please comment if you want me to explain more.
Cheers !
Edit:
Just wanted to add that MiniDumpWriteDump() requires a pointer to a MINIDUMP_EXCEPTION_INFORMATION struct. But the GetExceptionInformation() macro provides this for you at the time of the exception, in your exception handler (structured exception handling, or SEH):
__try {
    // Code that might raise a structured exception goes here.
}
__except (YourHandlerFunction(GetExceptionInformation())) {
    // Execution resumes here after YourHandlerFunction() has written the dump.
}
YourHandlerFunction() will be the one taking care of generating the minidump (or some other function down the call chain). Also, if you have custom errors in your program, e.g. something happens that should not happen but technically is not an exception, you can use RaiseException() to create your own.
GetExceptionInformation() can only be used in this context and nowhere else during program execution.
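For completeness, here is a sketch of what YourHandlerFunction() might look like. The function name comes from the snippet above; the file name and dump type are just examples:
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

LONG WINAPI YourHandlerFunction(EXCEPTION_POINTERS* pep)
{
    HANDLE hFile = CreateFileW(L"crash.dmp", GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile != INVALID_HANDLE_VALUE)
    {
        // MINIDUMP_EXCEPTION_INFORMATION carries the exception context
        // that GetExceptionInformation() handed us.
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = pep;
        mei.ClientPointers = FALSE;

        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), hFile,
                          MiniDumpNormal, &mei, NULL, NULL);
        CloseHandle(hFile);
    }
    // Tell SEH to execute the __except block instead of crashing.
    return EXCEPTION_EXECUTE_HANDLER;
}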
Crash dumps are a pretty common troubleshooting method and can be very effective, especially for problems that only reproduce at the customer's site.
Just make sure the customer/client understands what you're doing and that you have permission. It's possible that a crash dump can have sensitive information that a customer may not want (or be permitted) to let walk out the door or over the wire.
Better than that, there are libraries that will upload crash data back to you.
BugDump and BugSplat
And there's the Microsoft way:
http://msdn.microsoft.com/en-us/library/aa936273.aspx
Disclaimer: I am not a lawyer, nor do I pretend to be one, this is not legal advice.
The data you can include in logs and crash dumps also depend on what domain you are working in. For example, medical equipment and patient information systems often contain sensitive data about patients that should not be visible to unauthorized persons.
The HIPAA Privacy Rule regulates the use and disclosure of certain information held by "covered entities" (...) It establishes regulations for the use and disclosure of Protected Health Information (PHI). PHI is any information held by a covered entity which concerns health status, provision of health care, or payment for health care that can be linked to an individual. This is interpreted rather broadly and includes any part of an individual's medical record or payment history. --Wikipedia
It should not be possible to link health information to an individual. The crash dumps and logs should be anonymized and stripped of any sensitive information, or not sent at all.
Maybe this does not apply to your specific case, so this is more of a general note. I think it applies to other domains that handle sensitive information as well, such as the military and financial sectors.
Basically, the easiest way to produce a dump file is by using ADPlus; you don't need to change your code.
ADPlus is part of the Debugging Tools for Windows, as mentioned above.
ADPlus is basically a huge VBScript automation of WinDbg.
What you have to do to use ADPlus:
Download and install Debugging tools for windows to c:\debuggers
start your application
open a commandline and navigate to c:\debuggers
run this line "adplus -crash your_exe.exe"
reproduce the crash
you'll get a minidump with all the information you need.
you can open the crash dump in your favorite debugger.
Within WinDbg, the command "!analyze -v" helped me in at least 40% of all the crashes that only happened at the customer's site and were not reproducible in-house.
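A hypothetical session might look like this (the process name and output directory are examples):
C:\debuggers> adplus -crash -pn your_exe.exe -o C:\dumps
...reproduce the crash, then open the resulting .dmp file in WinDbg...
0:000> !analyze -v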

Finding GDI/User resource usage from a crash dump

I have a crash dump of an application that is supposedly leaking GDI. The app is running on XP, and I have no problems loading the dump into WinDbg to look at it. Previously we used the Gdikdx.dll extension to look at GDI information, but this extension is not supported on XP or Vista.
Does anyone have any pointers for finding GDI object usage in WinDbg?
Alternatively, I do have access to the failing program (and its stress-testing suite), so I can reproduce the problem on a running system if you know of any 'live' debugging tools for XP and Vista (or Windows 2000, though this is not our target).
I've spent the last week working on a GDI leak finder tool. We also perform regular stress testing, and it never lasted longer than a day's worth without stopping due to user/GDI object handle overconsumption.
My attempts have been pretty successful as far as I can tell. Of course, I spent some time beforehand looking for an alternative, quicker solution. It is worth mentioning that I had some previous semi-lucky experience with the GDILeaks tool from the MSDN article mentioned above - not to mention that I had to solve a few problems before putting it to work, and this time it just didn't give me what I wanted, or how I wanted it. The downside of their approach is the heavyweight debugger interface (it slows down the target under investigation by orders of magnitude, which I found unacceptable). Another downside is that it did not work all the time - on some runs I simply could not get it to report/compute anything! Its complexity (judging by the amount of code) was another scare-away factor. I'm not a big fan of GUIs, as it is my belief that I'm more productive with no windows at all ;o). I also found it hard to make it find and use my symbols.
One more tool I used before setting out to write my own was leakbrowser.
Anyways, I finally settled on an iterative approach to achieve the following goals:
minor performance penalties
implementation simplicity
non-invasiveness (used for multiple products)
relying as much as possible on what was already available
I used Detours (non-commercial use) for the core functionality (it is an injectable DLL), and put JavaScript to use for automatic code generation (a 15K script generating 100K of source code - no way I would code this manually, and no C preprocessor involved!), plus a WinDbg extension for data analysis and snapshot/diff support.
To make a long story short: after I was finished, it was a matter of a few hours to collect information during another stress test, and another hour to analyze and fix the leaks.
I'll be more than happy to share my findings.
P.S. I did spend some time trying to improve on the previous work. My intention was to minimize false positives (I've seen just about too many of those while developing), so the tool also checks for allocation/release consistency and avoids taking into account allocations that are never leaked.
Edit: Find the tool here
There was an MSDN Magazine article from several years ago that talked about GDI leaks. It points to several different places with good information.
In WinDbg, you may also try the !poolused command for some information.
Finding resource leaks from a crash dump (post-mortem) can be difficult - if it was always the same place, using the same variable that leaks the memory, and you're lucky, you could see the last place where it was leaked, etc. It would probably be much easier with a live program running under the debugger.
You can also try using Microsoft Detours, but the license doesn't always work out. It's also a bit more invasive and advanced.
I have created a WinDbg script for that. Look at my answer to:
Command to get GDI handle count from a crash dump
To track the allocation stack, you could set a ba (break on access) breakpoint past the last allocated GDICell object, so execution breaks just at the point when another GDI allocation happens. That could be a bit complex because the address changes, but it could be enough to find pretty much any leak.
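For example, once you know the address of the next free GDICell slot (the address below is purely hypothetical), a 4-byte break-on-write breakpoint would look like:
0:000> ba w4 0x03a20e60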

Comparing cold start to warm start

Our application takes significantly more time to launch after a reboot (cold start) than if it has already been opened once (warm start).
Most (if not all) of the difference seems to come from loading DLLs; when the DLLs are in cached memory pages they load much faster. We tried using ClearMem to simulate rebooting (since it's much less time-consuming than actually rebooting) and got mixed results: on some machines it seemed to simulate a reboot very consistently, and on some it didn't.
To sum up my questions are:
Have you experienced differences in launch time between cold and warm starts?
How have you dealt with such differences?
Do you know of a way to dependably simulate a reboot?
Edit:
Clarifications for comments:
The application is mostly native C++ with some .NET (the first .NET assembly that's loaded pays for the CLR).
We're looking to improve load time; obviously we did our share of profiling and improved the hotspots in our code.
Something I forgot to mention: we got some improvement by rebasing all our binaries so the loader doesn't have to do it at load time.
As for simulating reboots, have you considered running your app from a virtual PC? Using virtualization you can conveniently replicate a set of conditions over and over again.
I would also consider some type of profiling app to spot the bit of code causing the time lag, and then make a judgement call about how much of that code is really necessary, or whether it could be achieved in a different way.
It would be hard to truly simulate a reboot in software. When you reboot, all devices in your machine get their reset bit asserted, which should cause all memory system-wide to be lost.
In a modern machine you've got memory and caches everywhere: there's the VM subsystem, which is storing pages of memory for the program; then you've got the OS caching the contents of files in memory; then you've got the on-disk buffer of sectors on the hard drive itself. You can probably get the OS caches to be reset, but the on-disk buffer on the drive? I don't know of a way.
How did you profile your code? Not all profiling methods are equal and some find hotspots better than others. Are you loading lots of files? If so, disk fragmentation and seek time might come into play.
Maybe even sticking basic timing information into the code, writing out to a log file and examining the files on cold/warm start will help identify where the app is spending time.
Without more information, I would lean towards filesystem/disk cache as the likely difference between the two environments. If that's the case, then you either need to spend less time loading files upfront, or find faster ways to load files.
Example: if you are loading lots of binary data files, speed up loading by combining them into a single file, then do a slurp of the whole file into memory in one read and parse the contents. Fewer disk seeks and less time spent reading off of disk. Again, maybe that doesn't apply.
I don't know offhand of any tools to clear the disk/filesystem cache, but you could write a quick application that reads a bunch of unrelated files off of disk to cause the filesystem/disk cache to be loaded with different data.
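A quick-and-dirty sketch of such a cache-polluting reader (the file paths are placeholders - point it at a few gigabytes of unrelated data):
#include <stdio.h>

int main()
{
    // Hypothetical large, unrelated files; streaming through them
    // evicts your application's files from the filesystem cache.
    const char* files[] = { "C:\\big\\a.bin", "C:\\big\\b.bin" };
    static char buf[1 << 20];  // 1 MB read buffer
    for (int i = 0; i < (int)(sizeof(files) / sizeof(files[0])); ++i)
    {
        FILE* f = fopen(files[i], "rb");
        if (!f) continue;
        while (fread(buf, 1, sizeof(buf), f) == sizeof(buf))
            ;  // keep reading until the file is exhausted
        fclose(f);
    }
    return 0;
}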
@Morten Christiansen said:
One way to make apps cold-start faster (sort of), used by e.g. Adobe Reader, is to preload some of the application's files at system startup, thereby hiding the cold start from the user. This is only usable if the program is not supposed to start up immediately.
That makes the customer pay for initializing our app at every boot, even when it isn't used. I really don't like that option (and neither does Raymond).
One successful way to speed up application startup is to switch DLLs to delay-load. This is a low-cost change (some fiddling with project settings) but can make startup significantly faster. Afterwards, run depends.exe in profiling mode to figure out which DLLs load during startup anyway, and revert the delay-load on those. Remember that you may also delay-load most Windows DLLs you need.
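Hypothetically, if foo.dll is one of your imports that isn't needed at startup, the switch amounts to adding the delay-load helper library and one linker flag:
link app.obj user32.lib foo.lib delayimp.lib /DELAYLOAD:foo.dll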
A very effective technique for improving application cold launch time is optimizing function link ordering.
The Visual Studio linker lets you pass in a file that lists all the functions in the module being linked (or just some of them - it doesn't have to be all of them), and the linker will place those functions next to each other in memory.
When your application is starting up, there are typically calls to init functions throughout your application. Many of these calls will be to a page that isn't in memory yet, resulting in a page fault and a disk seek. That's where slow startup comes from.
Optimizing your application so all these functions are together can be a big win.
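With the Visual Studio linker this is the /ORDER option. A sketch (the decorated function names are hypothetical - you'd pull real ones from your map file):
; startup.txt - one decorated COMDAT name per line
?InitNetwork@@YAXXZ
?InitUI@@YAXXZ

link app.obj /ORDER:@startup.txt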
Check out Profile-Guided Optimization in Visual Studio 2005 or later. One of the things that PGO does for you is function link ordering.
It's a bit difficult to work into a build process, because with PGO you need to link, run your application, and then re-link with the output from the profile run. This means your build process needs a runtime environment and must deal with cleaning up after bad builds and all that, but the payoff is typically a 10% or more faster cold launch with no code changes.
There's some more info on PGO here:
http://msdn.microsoft.com/en-us/library/e7k32f4k.aspx
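The PGO cycle, roughly (flags from the VS2005-era toolchain; details vary by version):
cl /GL /c app.cpp
link /LTCG:PGINSTRUMENT app.obj
app.exe          (exercise a representative cold-start scenario)
link /LTCG:PGOPTIMIZE app.obj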
As an alternative to a function order list, you can just group the code that will be called at startup into the same sections:
#pragma code_seg(".startUp")
//...
#pragma code_seg
#pragma data_seg(".startUp")
//...
#pragma data_seg
It should be easy to maintain as your code changes, and it has the same benefit as the function order list.
I am not sure whether a function order list can specify global variables as well, but using #pragma data_seg simply works.
One way to make apps cold-start faster (sort of), used by e.g. Adobe Reader, is to preload some of the application's files at system startup, thereby hiding the cold start from the user. This is only usable if the program is not supposed to start up immediately.
Another note: .NET 3.5 SP1 supposedly has much improved cold-start speed, though by how much I cannot say.
It could be the NICs (LAN cards): your app may depend on certain other services that require the network to come up. Profiling your application alone may not tell you this, so you should examine your application's dependencies.
If your application is not very complicated, you can just copy all the executables to another directory; it should be similar to a reboot. (Cut and paste doesn't seem to work: Windows is smart enough to know that files moved to another folder are still cached in memory.)
