How to write to the memory of a DLL? - winapi

I'm trying to replace a specific string in memory that belongs to a DLL. Here's the code.
I can read it, and it gives me the correct result, but when writing, VC++ shows 'Access violation writing location'.
HMODULE HMODULE1 = LoadLibrary(L"my.dll");
std::string x1(8, '\0');
std::string x2 = "CIFCDMEY";
auto startPos = (void*)((char*)(HMODULE1)+0x1158A0 + 9);
// Correct, I can read the memory
memcpy_s((void*)x1.data(), x1.size(), startPos, x1.size());
// Access violation writing location
memcpy_s(startPos, x2.size(), x2.data(), x2.size());
auto handle = GetCurrentProcess();
SIZE_T num;
auto ret = WriteProcessMemory(handle, startPos, x2.data(), x2.size(), &num);
auto lastError1 = GetLastError();
LPVOID lpMessageBuffer1 = NULL;
size_t size1 = ::FormatMessage(
FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
NULL,
lastError1,
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
(LPTSTR)&lpMessageBuffer1,
0,
NULL);
std::wstring errorMessage1;
if (size1 > 0) {
// errorMessage1: Invalid access to memory location.
errorMessage1 = std::wstring((LPCTSTR)lpMessageBuffer1, size1);
}
In 'Watch' window, the value of variable startPos is my.dll!0x0f2a58a9 (load symbols for additional information).
I know people use 'WriteProcessMemory' to write to the memory of a process. How about a DLL?

If the target memory page does not have write permission, you need to change its protection with VirtualProtect().
I use a simple wrapper function for all my patching. It uses the PAGE_EXECUTE_READWRITE protection constant rather than PAGE_READWRITE: if you are modifying a code page and the instruction pointer lands in that page while it is only PAGE_READWRITE, the process crashes.
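void Patch(char* dst, const char* src, const intptr_t size)
{
    DWORD oldprotect;
    // Make the target pages writable, keeping them executable in case they hold code.
    if (!VirtualProtect(dst, size, PAGE_EXECUTE_READWRITE, &oldprotect))
        return; // protection could not be changed, so do not attempt the write
    memcpy(dst, src, size);
    // Restore the original protection.
    VirtualProtect(dst, size, oldprotect, &oldprotect);
}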

Related

Windows memory metric to detect memory leak

We have large old legacy server code running as a 64bit windows service.
The service has a memory leak which, at the moment, we do not have the resources to fix.
Since the service tolerates restarts, the temporary (and admittedly terrible) 'solution' we want is to detect when the service's memory exceeds, e.g., 5 GB, and exit the service (which restarts automatically in such cases).
My question is which metric should I go for? Is using GlobalMemoryStatusEx to get MEMORYSTATUSEX.ullTotalVirtual - MEMORYSTATUSEX.ullAvailVirtual right?
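GlobalMemoryStatusEx is the wrong tool for this. Most of its fields describe the machine as a whole, and ullTotalVirtual - ullAvailVirtual measures how much of the calling process's address space is reserved or committed, not how much memory the process has actually committed. You do not want a metric that only trips once the whole machine is nearly out of memory.
You need GetProcessMemoryInfo.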
BOOL WINAPI GetProcessMemoryInfo(
__in HANDLE Process,
__out PPROCESS_MEMORY_COUNTERS ppsmemCounters,
__in DWORD cb
);
From an example using GetProcessMemoryInfo:
#include <windows.h>
#include <stdio.h>
#include <psapi.h>
// To ensure correct resolution of symbols, add Psapi.lib to TARGETLIBS
// and compile with -DPSAPI_VERSION=1
void PrintMemoryInfo( DWORD processID )
{
HANDLE hProcess;
PROCESS_MEMORY_COUNTERS pmc;
// Print the process identifier.
printf( "\nProcess ID: %u\n", processID );
// Print information about the memory usage of the process.
hProcess = OpenProcess( PROCESS_QUERY_INFORMATION |
PROCESS_VM_READ,
FALSE, processID );
if (NULL == hProcess)
return;
if ( GetProcessMemoryInfo( hProcess, &pmc, sizeof(pmc)) )
{
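// The counters are SIZE_T (64 bits in a 64-bit build), so widen them explicitly for printf.
printf( "\tWorkingSetSize: 0x%llX\n", (unsigned long long)pmc.WorkingSetSize );
printf( "\tPagefileUsage: 0x%llX\n", (unsigned long long)pmc.PagefileUsage );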
}
CloseHandle( hProcess );
}
int main( void )
{
// Get the list of process identifiers.
DWORD aProcesses[1024], cbNeeded, cProcesses;
unsigned int i;
if ( !EnumProcesses( aProcesses, sizeof(aProcesses), &cbNeeded ) )
{
return 1;
}
// Calculate how many process identifiers were returned.
cProcesses = cbNeeded / sizeof(DWORD);
// Print the memory usage for each process
for ( i = 0; i < cProcesses; i++ )
{
PrintMemoryInfo( aProcesses[i] );
}
return 0;
}
Although it is unintuitive, the value to read is PagefileUsage, which is the commit charge: the memory your process has committed, whether it currently resides in RAM or in the page file. WorkingSetSize is unreliable because, when the machine runs short of memory, the OS trims working sets and writes the pages out to the page file. WorkingSetSize can then be small (e.g. 100 MB) even though the process has already leaked 20 GB. Plotted over time, this shows up as a sawtooth pattern in the working set while the leak keeps growing until the page file is full. The working set covers only the actively used memory, which can hide a multi-GB leak when the machine is under memory pressure.
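The check the question asks for could be sketched roughly as follows. This is a minimal sketch, not a tested watchdog: the names kLimitBytes, ExceedsLimit and ShouldRestartService are made up for illustration, the 5 GB limit is the figure from the question, and the Windows-specific part is guarded by _WIN32 so the threshold logic compiles anywhere. On Windows, link with Psapi.lib as noted in the example above.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#include <psapi.h>
#endif

// Hypothetical limit taken from the question: 5 GB.
// The ULL suffix matters: 5 * 1024 * 1024 * 1024 overflows a 32-bit int.
constexpr std::uint64_t kLimitBytes = 5ULL * 1024 * 1024 * 1024;

// Pure threshold check, kept separate so it can be tested without the Windows API.
bool ExceedsLimit(std::uint64_t committedBytes, std::uint64_t limitBytes)
{
    return committedBytes > limitBytes;
}

#ifdef _WIN32
// Returns true when the commit charge of the current process is above the limit.
// If the query fails, this sketch reports false rather than forcing a restart.
bool ShouldRestartService()
{
    PROCESS_MEMORY_COUNTERS pmc{};
    pmc.cb = sizeof(pmc);
    if (!GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc)))
        return false;
    return ExceedsLimit(pmc.PagefileUsage, kLimitBytes);
}
#endif
```

The service could call ShouldRestartService() from a timer or housekeeping thread and exit cleanly when it returns true. GetCurrentProcess() avoids the OpenProcess call, since the service is checking itself.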

How to identify what parts of the allocated virtual memory a process is using

I want to be able to search through the allocated memory of a process (say you open notepad and type “HelloWorld” then ran the search looking for the string “HelloWorld”). For 32bit applications this is not a problem but for 64 bit applications the large quantity of allocated virtual memory takes hours to search through.
Obviously the vast majority of applications are not utilising the full amount of virtual memory allocated. I can identify the areas in memory allocated to each process with VirtualQueryEx and read them with ReadProcessMemory, but when it comes to 64-bit applications this still takes hours to complete.
Does anyone know of any resources or any methods that could be used to help narrow down the amount of memory to be searched?
It is important that you only scan valid memory. If you just scanned from 0x0 to 0xFFFFFFFFF it would take at least 5 seconds in most processes. You can skip bad regions by checking each region's page settings with VirtualQueryEx. It fills in a MEMORY_BASIC_INFORMATION structure that describes the state of that memory region.
If the State member is not MEM_COMMIT, the region is not backed by memory and can be skipped.
If the Protect member is PAGE_NOACCESS, you also want to skip the region.
If VirtualQueryEx fails, skip to the next region.
In this manner it should only take 0-2 seconds to scan the memory of your average process, because only readable, committed memory is scanned.
char* ScanEx(char* pattern, char* mask, char* begin, intptr_t size, HANDLE hProc)
{
char* match{ nullptr };
SIZE_T bytesRead;
DWORD oldprotect;
char* buffer{ nullptr };
MEMORY_BASIC_INFORMATION mbi;
mbi.RegionSize = 0x1000; // fallback step of one page in case the first query fails
VirtualQueryEx(hProc, (LPCVOID)begin, &mbi, sizeof(mbi));
for (char* curr = begin; curr < begin + size; curr += mbi.RegionSize)
{
if (!VirtualQueryEx(hProc, curr, &mbi, sizeof(mbi))) continue;
if (mbi.State != MEM_COMMIT || mbi.Protect == PAGE_NOACCESS) continue;
delete[] buffer;
buffer = new char[mbi.RegionSize];
if (VirtualProtectEx(hProc, mbi.BaseAddress, mbi.RegionSize, PAGE_EXECUTE_READWRITE, &oldprotect))
{
ReadProcessMemory(hProc, mbi.BaseAddress, buffer, mbi.RegionSize, &bytesRead);
VirtualProtectEx(hProc, mbi.BaseAddress, mbi.RegionSize, oldprotect, &oldprotect);
char* internalAddr = ScanBasic(pattern, mask, buffer, (intptr_t)bytesRead);
if (internalAddr != nullptr)
{
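// Translate the hit in the local buffer back to an address in the target process.
// The buffer was read from mbi.BaseAddress, which may lie below curr.
match = (char*)mbi.BaseAddress + (internalAddr - buffer);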
break;
}
}
}
delete[] buffer;
return match;
}
ScanBasic is just a standard comparison function which compares your pattern against the buffer.
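The answer does not show ScanBasic, so here is one way it could look. This is a minimal sketch under an assumed mask convention that is common in pattern scanners but not stated in the answer: the mask has one character per pattern byte, '?' marks a wildcard byte and any other character (typically 'x') means the byte must match exactly.

```cpp
#include <cstdint>
#include <cstring>

// Returns a pointer to the first match of pattern inside [begin, begin + size),
// or nullptr if there is none. The mask length defines the pattern length,
// which allows the pattern itself to contain zero bytes.
char* ScanBasic(const char* pattern, const char* mask, char* begin, std::intptr_t size)
{
    const std::intptr_t patternLen = static_cast<std::intptr_t>(std::strlen(mask));
    if (patternLen == 0 || size < patternLen)
        return nullptr;

    for (std::intptr_t i = 0; i + patternLen <= size; ++i)
    {
        bool found = true;
        for (std::intptr_t j = 0; j < patternLen; ++j)
        {
            // '?' in the mask skips the comparison for this byte.
            if (mask[j] != '?' && pattern[j] != begin[i + j])
            {
                found = false;
                break;
            }
        }
        if (found)
            return begin + i;
    }
    return nullptr;
}
```

Searching for the "HelloWorld" example from the question would use the pattern "HelloWorld" with the mask "xxxxxxxxxx". Note that Notepad stores its text as UTF-16, so in practice the pattern would have to be the wide-character byte sequence.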
Second, if you know the address is relative to a module, only scan the address range of that module. You can get the module's base address and size from a CreateToolhelp32Snapshot snapshot. If you know it's dynamic memory on the heap, then only scan the heap. You can also enumerate all the heaps with CreateToolhelp32Snapshot and TH32CS_SNAPHEAPLIST.
You can also wrap this function to scan the entire address space of the process. It might look something like this:
char* Pattern::Ex::ScanProc(char* pattern, char* mask, ProcEx& proc)
{
unsigned long long int kernelMemory = IsWow64Proc(proc.handle) ? 0x80000000 : 0x800000000000;
return Scan(pattern, mask, 0x0, (intptr_t)kernelMemory, proc.handle);
}

Problem with writing out large temp files

I have a set of large image files that I'm using as temporary swap files on Windows in Visual Studio 2010. I'm writing and reading the files out as necessary.
The problem is that, even though each of the images is the same size, I'm getting different file sizes on disk.
So, I can do:
template <typename T>
std::string PlaceFileOnDisk(T* inImage, const int& inSize){
TCHAR lpTempPathBuffer[MAX_PATH];
TCHAR szTempFileName[MAX_PATH];
DWORD dwRetVal = GetTempPath(MAX_PATH, lpTempPathBuffer);
UINT uRetVal = GetTempFileName(lpTempPathBuffer, TEXT("IMAGE"), 0, szTempFileName);
FILE* fp;
fopen_s(&fp, szTempFileName, "w+");
fwrite(inImage, sizeof(T), inSize, fp);
fclose(fp);
std::string theRealTempFileName(szTempFileName);
return theRealTempFileName;
}
but that results in files between 53 and 65 MB in size (the image is 4713 * 5908 * sizeof(unsigned short)).
I figured that maybe that 'fwrite' might not be stable for large files, so I broke things up into:
template <typename T>
std::string PlaceFileOnDisk(T* inImage, const int& inYSize, const int& inXSize){
TCHAR lpTempPathBuffer[MAX_PATH];
TCHAR szTempFileName[MAX_PATH];
DWORD dwRetVal = GetTempPath(MAX_PATH, lpTempPathBuffer);
UINT uRetVal = GetTempFileName(lpTempPathBuffer, TEXT("IMAGE"), 0, szTempFileName);
int y;
FILE* fp;
for (y = 0; y < inYSize; y++){
fopen_s(&fp, szTempFileName, "a");
fwrite(&(inImage[y*inXSize]), sizeof(T), inXSize, fp);
fclose(fp);
}
std::string theRealTempFileName(szTempFileName);
return theRealTempFileName;
}
Same thing: the files that are saved to disk are variable sized, not the expected size.
What's going on? Why are they not the same?
The read in function:
template <typename T>
T* RecoverFileFromDisk(const std::string& inFileName, const int& inSize){
T* theBuffer = NULL;
FILE* fp;
try {
theBuffer = new T[inYSize*inXSize];
fopen_s(&fp, inFileName.c_str(), "r");
fread(theBuffer, sizeof(T), inSize, fp);
fclose(fp);
}
catch(...){
if (theBuffer != NULL){
delete [] theBuffer;
theBuffer = NULL;
}
}
return theBuffer;
}
This function may be suffering from similar problems, but I'm not getting that far, because I can't get past the writing function.
I did try to use the read/write information on this page:
http://msdn.microsoft.com/en-us/library/aa363875%28v=vs.85%29.aspx
But the suggestions there just didn't work at all, so I went with the file functions with which I'm more familiar. That's where I got the temp file naming conventions, though.
Are you able to open the image after it's written? It sounds like you're having trouble reading it back, too.
Are you just asking why the file sizes differ for pictures of the same size? How do the sizes of the initial files compare to each other? It may have something to do with how the initial image files are compressed.
I'm not sure what you're doing with the files, but have you considered a more basic "copy" function?
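One cause worth ruling out, given that the expected size is 4713 * 5908 * 2 = 55,688,808 bytes (about 53 MB) and the files come out between 53 and 65 MB: the modes "w+", "a" and "r" open the file in text mode by default on Windows. In text mode every 0x0A byte written is expanded to 0x0D 0x0A, and a 0x1A byte can end a read early. Raw pixel data contains those bytes in varying numbers, so the size on disk would differ from image to image. A minimal sketch, assuming that is what is happening here; WriteRaw and ReadRaw are hypothetical names, and std::fopen is used for portability (fopen_s takes the same mode strings):

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Writes count elements as raw bytes. The "b" in the mode disables newline translation.
template <typename T>
bool WriteRaw(const std::string& path, const T* data, std::size_t count)
{
    std::FILE* fp = std::fopen(path.c_str(), "wb");
    if (!fp)
        return false;
    const std::size_t written = std::fwrite(data, sizeof(T), count, fp);
    const bool closed = (std::fclose(fp) == 0);
    return written == count && closed;
}

// Reads count elements back as raw bytes; fails if the file is shorter than expected.
template <typename T>
bool ReadRaw(const std::string& path, std::vector<T>& out, std::size_t count)
{
    std::FILE* fp = std::fopen(path.c_str(), "rb");
    if (!fp)
        return false;
    out.resize(count);
    const std::size_t got = std::fread(out.data(), sizeof(T), count, fp);
    std::fclose(fp);
    return got == count;
}
```

Checking the return values of fwrite and fread, which the original functions ignore, also makes a short write or read visible instead of silent.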

SetFilePointerEx fails to read physical disk beyond size of LONG

It's taken a few years, but I am finally taking the plunge into VC++. I need to be able to read x number of sectors of a physical device (namely a hard drive). I am using the CreateFile() and SetFilePointerEx() and ReadFile() APIs.
I have done a LOT of reading online in all the major forums about this topic. I have exhausted my research and now I feel it's time to ask the experts to weigh in on this dilemma. As this is my very first post ever on this topic, please go easy on me :)
I should also point out that this is a .DLL that I consume with a simple C# app. The plumbing all works fine. It's the SetFilePointer(Ex)() APIs that are causing me grief.
I can get the code to work up until about the size of a LONG (4,xxx,xxx) - I can't remember the exact value. It suffices to say that I can read everything up to and including sector # 4,000,000 but not 5,000,000 or above. The problem lies in the "size" of the parameters for the SetFilePointer() and SetFilePointerEx() APIs. I've tried both and so far, SetFilePointerEx() seems to be what I should use to work on 64-bit systems.
The 2nd and 3rd parameters of SetFilePointerEx are defined as follows:
BOOL WINAPI SetFilePointerEx(
__in HANDLE hFile,
__in LARGE_INTEGER liDistanceToMove,
__out_opt PLARGE_INTEGER lpNewFilePointer,
__in DWORD dwMoveMethod
);
Please note that I have tried passing the LowPart and the HighPart as the 2nd and 3rd parameters without any success, as I get a CANNOT CONVERT LARGE_INTEGER TO PLARGE_INTEGER error (for parameter 3).
HERE IS MY CODE. I USE A CODE-BREAK TO VIEW buff[0], etc. I would like to read past the 4,xxx,xxx limitation. Obviously I am doing something wrong. Each read past this limit resets my file pointer to sector 0.
#include "stdafx.h"
#include <windows.h>
#include <conio.h>
extern "C"
__declspec(dllexport) int ReadSectors(long startSector, long numSectors)
{
HANDLE hFile;
const int SECTOR_SIZE = 512;
const int BUFFER_SIZE = 512;
LARGE_INTEGER liDistanceToMove;
PLARGE_INTEGER newFilePtr = NULL; // not used in this context.
// just reading from START to END
liDistanceToMove.QuadPart = startSector * SECTOR_SIZE;
DWORD dwBytesRead, dwPos;
LPCWSTR fname = L"\\\\.\\PHYSICALDRIVE0";
char buff[BUFFER_SIZE];
// Open the PHYSICALDEVICE as a file.
hFile = CreateFile(fname,
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
// Here's the API definition
/*BOOL WINAPI SetFilePointerEx(
__in HANDLE hFile,
__in LARGE_INTEGER liDistanceToMove,
__out_opt PLARGE_INTEGER lpNewFilePointer,
__in DWORD dwMoveMethod
);*/
dwPos = SetFilePointerEx(hFile, liDistanceToMove, NULL, FILE_BEGIN);
if(ReadFile(hFile, buff, BUFFER_SIZE, &dwBytesRead, NULL))
{
if(dwBytesRead > 5)
{
BYTE x1 = buff[0];
BYTE x2 = buff[1];
BYTE x3 = buff[2];
BYTE x4 = buff[3];
BYTE x5 = buff[4];
}
}
// Close both files.
CloseHandle(hFile);
return 0;
}
startSector * SECTOR_SIZE;
startSector is a long (32 bits on Windows) and SECTOR_SIZE is an int (also 32 bits). Multiplying the two produces a 32-bit intermediate result, which overflows once startSector reaches 4,194,304 (4,194,304 * 512 = 2^31). Only then is the already-wrapped value widened into the __int64 of the LARGE_INTEGER, which is too late. You want the multiplication itself to operate on __int64s, something like
liDistanceToMove.QuadPart = startSector;
liDistanceToMove.QuadPart *= SECTOR_SIZE;
for example.
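The same fix can be written as a single expression and checked without touching a disk. A minimal sketch, using std::int32_t to stand in for the 32-bit long of the Windows ABI; SectorToByteOffset is a hypothetical helper name:

```cpp
#include <cstdint>

// Widen to 64 bits before multiplying, so the product cannot wrap at 2^31.
std::int64_t SectorToByteOffset(std::int32_t startSector, std::int32_t sectorSize)
{
    return static_cast<std::int64_t>(startSector) * sectorSize;
}
```

With this, liDistanceToMove.QuadPart = SectorToByteOffset(startSector, SECTOR_SIZE); gives 2,560,000,000 for sector 5,000,000, a value that does not fit in 32 bits. The cast has to be applied to an operand, not to the finished product: static_cast<std::int64_t>(startSector * sectorSize) would still multiply in 32 bits.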

How does one use VirtualAllocEx to make room for a code cave?

How does one use VirtualAllocEx to make room for a code cave? I am currently in possession of a piece of software with very little "free space" and I read that VirtualAllocEx is used for making this space.
Once the question of what a "code cave" is has been cleared up, you may find the following code interesting. It enumerates the blocks of virtual memory in the current process and finds all PE images (the DLLs and the EXE itself).
SYSTEM_INFO si;
MEMORY_BASIC_INFORMATION mbi;
DWORD nOffset = 0;
SIZE_T cbReturned;
ULONG_PTR dwMem; // pointer-sized, so the walk also works in a 64-bit process
GetSystemInfo(&si);
for (dwMem = 0; dwMem<(ULONG_PTR)si.lpMaximumApplicationAddress;
dwMem+=mbi.RegionSize) {
cbReturned = VirtualQueryEx (GetCurrentProcess(), (LPCVOID)dwMem, &mbi,
sizeof(mbi));
if (cbReturned) {
if ((mbi.AllocationProtect & PAGE_EXECUTE_WRITECOPY) &&
(mbi.Protect & (PAGE_EXECUTE | PAGE_EXECUTE_READ |
PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY))) {
if (*(LPWORD)mbi.AllocationBase == IMAGE_DOS_SIGNATURE) {
IMAGE_DOS_HEADER *pDosHeader =
(IMAGE_DOS_HEADER *)mbi.AllocationBase;
if (pDosHeader->e_lfanew) {
IMAGE_NT_HEADERS32 *pNtHeader = (IMAGE_NT_HEADERS32 *)
((PBYTE)pDosHeader + pDosHeader->e_lfanew);
if (pNtHeader->Signature != IMAGE_NT_SIGNATURE)
continue;
// now you can examine of module loaded in current process
}
}
}
}
}
The code may look like a long-running loop, but in a typical application it makes only about 200 iterations, so it walks all the blocks allocated while loading the EXE and all its dependent DLLs very quickly.
#include <stdio.h>
#include <windows.h>
#include <commctrl.h>
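DWORD pid;
HANDLE process;
// listview is the HWND of a window owned by the target process.
GetWindowThreadProcessId(listview, &pid);
process = OpenProcess(PROCESS_VM_OPERATION|PROCESS_VM_READ | PROCESS_VM_WRITE|PROCESS_QUERY_INFORMATION, FALSE, pid);
// Commit sizeof(int) bytes (rounded up to a whole page) inside the target process.
// PAGE_READWRITE is enough for data; memory that will be executed as a code cave
// needs an executable protection such as PAGE_EXECUTE_READWRITE.
int *vptr = (int *)VirtualAllocEx(process, NULL, sizeof(int), MEM_COMMIT, PAGE_READWRITE);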
References
- MSDN VirtualAllocEx Function
- CodeProject Stealing Program's Memory
- StackOver What is a code cave... ?
HTH,
