I'm writing a complex program that analyzes users' writing, and I have a problem when running this application on a 64-bit OS.
Here is the code you can run to reproduce the problem:
http://thetechnofreak.com/technofreak/keylogger-visual-c/
But of course, you need a 64-bit OS, since the program runs correctly on a 32-bit OS.
After this call:
pKbd = pKbdLayerDescriptor();
this pointer is NULL:
pKbd->pVkToWcharTable
I tried to google for a solution first, and I found this:
http://www.codeproject.com/Questions/211107/RegQueryValueEx-programcrash-on-64-Bit
It's the exact same problem I have, but there doesn't seem to be a solution there.
So, do you have any ideas what could be wrong?
There is this piece of code in the program, and it seems to take care of the size difference between pointers on 32- and 64-bit architectures:
#if defined(BUILD_WOW6432)
#define KBD_LONG_POINTER __ptr64
#else
#define KBD_LONG_POINTER
#endif
But clearly, it's not helping.
I've just had exactly the same issue with that piece of code.
I'll assume you're compiling for 32-bit but running on 64-bit, as I am. If so, first you need to define BUILD_WOW6432 before including kbd.h (or kbdext.h if you're using it). Secondly, use
SHGetFolderPath(NULL, CSIDL_SYSTEMX86, NULL, 0, systemDirectory)
instead of GetSystemDirectory(systemDirectory, MAX_PATH). This means you always load the 32-bit keyboard layout DLL, even on 64-bit machines.
This solved the problem for me, hope it helps you :)
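For reference, here is roughly what the loading code ends up looking like with both changes applied. This is a minimal sketch, assuming the DDK's kbd.h and a hard-coded US layout DLL name ("KBDUS.DLL" is an assumption; real code should derive the name from GetKeyboardLayoutName):

#define BUILD_WOW6432   // must come before kbd.h so KBD_LONG_POINTER becomes __ptr64
#include <windows.h>
#include <shlobj.h>     // SHGetFolderPathA, CSIDL_SYSTEMX86
#include <kbd.h>

typedef PKBDTABLES (WINAPI *PFN_KBDLAYERDESCRIPTOR)(VOID);

PKBDTABLES LoadKbdTables(void)
{
    // CSIDL_SYSTEMX86 resolves to the 32-bit system directory (SysWOW64 on a
    // 64-bit OS), so a 32-bit process always loads a 32-bit layout DLL.
    char systemDirectory[MAX_PATH];
    if (FAILED(SHGetFolderPathA(NULL, CSIDL_SYSTEMX86, NULL, 0, systemDirectory)))
        return NULL;

    char dllPath[MAX_PATH];
    wsprintfA(dllPath, "%s\\KBDUS.DLL", systemDirectory);

    HMODULE hKbd = LoadLibraryA(dllPath);
    if (hKbd == NULL)
        return NULL;

    PFN_KBDLAYERDESCRIPTOR pKbdLayerDescriptor =
        (PFN_KBDLAYERDESCRIPTOR)GetProcAddress(hKbd, "KbdLayerDescriptor");
    return pKbdLayerDescriptor ? pKbdLayerDescriptor() : NULL;
}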
Related
Recently I started learning MPI programming, and I have tried it on both Linux and Windows. I don't have any problem running the MPI application on Linux; however, I stumbled upon an "expression must have a constant value" error in Visual Studio.
For example, I'm trying to get world_size via MPI_Comm_size(MPI_COMM_WORLD, &world_size); and create an array based on world_size.
Code sample:
#include <mpi.h>
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int database[world_size]; // error occurs here
However, when I run it on Linux it works perfectly fine: I'm able to execute the code while stating the number of processes I wish to have. Am I missing anything? I followed a particular YouTube tutorial that taught me how to install MS-MPI for Visual Studio 2015.
Any help would be greatly appreciated.
Automatic array sizing using non-const values actually works with gcc (https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html). However, it's considered a bad idea because (as you've just experienced) your code won't be portable anymore. You just need to change your code to create the array with new. You might also want gcc to flag this as an error, to make sure your code stays portable: Disable variable-length automatic arrays in gcc
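A minimal sketch of the portable version, with the variable-length array replaced by a heap allocation (error handling omitted):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int *database = new int[world_size];  // accepted by every conforming C++ compiler
    // ... use database ...
    delete[] database;

    MPI_Finalize();
    return 0;
}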
On 64-bit Windows, I've got a 32-bit process that reads the memory of other 32-bit processes, and I'd like it to be able to read 64-bit processes too.
ReadProcessMemory is being used to read the memory, but it has a 32 bit limitation. Is there any way of doing the equivalent of a ReadProcessMemory on a 64 bit process?
I know I could write a 64 bit process and launch that from my 32 bit process to do the work, but I'm wondering if there's some other option so that I don't need to write a 64 bit process.
Thanks.
It's possible.
For an example, you may refer to the excellent sample in tofucoder's answer.
For one more sample, you may refer to this link.
For an explanation of why this actually works, please check this thread.
Another sample may be found here.
The whole trick is to call the 64-bit version of the ReadProcessMemory function. Intuitively, that's not an option from a 32-bit process; however, as the links above explain, the x64 version of ntdll.dll is also loaded as part of a 32-bit process under the Windows WOW64 emulator. It has a function called NtReadVirtualMemory with the same prototype as ReadProcessMemory64:
__declspec(SPEC)BOOL __cdecl ReadProcessMemory64(HANDLE hProcess, DWORD64 lpBaseAddress, LPVOID lpBuffer, SIZE_T nSize, SIZE_T *lpNumberOfBytesRead);
The address is 64 bits wide, so the whole virtual address space of a 64-bit process can be addressed.
You may wonder how to get the address of this function. That's where another function in ntdll.dll comes in handy: LdrGetProcedureAddress. Its prototype is the same as GetProcAddress's:
__declspec(SPEC)DWORD64 __cdecl GetProcAddress64(DWORD64 hModule, char* funcName);
We have to examine the export directory of the x64 ntdll.dll and manually find this function's entry. After that, we can obtain the address of any other function.
One question is still left uncovered: how do we obtain the base address of the x64 ntdll.dll itself? One variant is to manually walk the x64 PEB structure of our process and traverse the loaded modules list. And how do we get the PEB address? Please refer to the links above, so as not to overflow this post with too many details.
All of this is covered in the sample from the first link.
Alternative variants using the NtReadVirtualMemory and NtWow64ReadVirtualMemory64 functions are provided in the second and third links (along with alternative ways to get the PEB address).
Summary: it is possible to interact with an x64 process from an x86 one. It can be done either with a direct call to the x64 version of a function (from the x64 ntdll.dll, which is loaded as part of every WOW64 process) or with a call to a specific x86 function that is intended to work with x64 processes (namely NtWow64ReadVirtualMemory64).
P.S. One may say this is undocumented and more of a hack, but it's simply not officially documented. Software like Unlocker, ProcessHacker, or Process Explorer makes use of these undocumented features (and many more); it's up to you to decide, of course.
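For the second variant, here is a minimal sketch of what the call can look like. The prototype is not in the SDK headers (the function is undocumented), so declaring it by hand is an assumption based on the links above:

#include <windows.h>

// Undocumented: exported by the 32-bit ntdll.dll only when running under WOW64.
typedef LONG (NTAPI *PFN_NtWow64ReadVirtualMemory64)(
    HANDLE   ProcessHandle,
    ULONG64  BaseAddress,        // full 64-bit address in the target process
    PVOID    Buffer,
    ULONG64  Size,
    ULONG64 *NumberOfBytesRead);

BOOL ReadMem64(HANDLE hProcess, ULONG64 address, void *buffer, ULONG64 size)
{
    PFN_NtWow64ReadVirtualMemory64 pRead =
        (PFN_NtWow64ReadVirtualMemory64)GetProcAddress(
            GetModuleHandleW(L"ntdll.dll"), "NtWow64ReadVirtualMemory64");
    if (pRead == NULL)           // not running under WOW64
        return FALSE;

    ULONG64 bytesRead = 0;
    // An NTSTATUS >= 0 means success.
    return pRead(hProcess, address, buffer, size, &bytesRead) >= 0
        && bytesRead == size;
}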
The library wow64ext seems to have solved this problem and offers a ReadProcessMemory64 function. The Visual Studio extension VSDebugTool seems to use this library, and it works for me with 64-bit processes.
Anyway, it shouldn't be impossible, because the (32-bit) Visual Studio debugger handles 64-bit debuggees very well.
No: http://blogs.msdn.com/b/oldnewthing/archive/2008/10/20/9006720.aspx
There's no way to get around this. One solution is to stop using the WOW64 emulator and write a 64-bit process instead. Another solution is to use IPC rather than reading memory directly.
ReadProcessMemory can read any size of memory, including x86 processes reading from x64 processes.
You can, without a problem, do the following in an x86 program:
DWORD64 test = 0;
ReadProcessMemory(hProcess, (LPCVOID)lpBaseAddress, &test, sizeof(DWORD64), NULL);
This allows you to dereference an x64 pointer from an x86 process.
I'm sort of new to Windows GUI programming.
I have some code which works fine on 32-bit Windows but goes weird on 64-bit Win7 (same exe).
LWG_CEDIT_GET( m_hwnd, IDC_EDIT_NUM_TEST, g_tmp_str, 4096 );
where LWG_CEDIT_GET is defined as:
#define LWG_CEDIT_GET(h,id,v,m) \
    ((*((U32*)(v))=(m)), /* EM_GETLINE: first word of the buffer must hold its size */ \
     SendMessage(GetDlgItem((h),(id)),EM_GETLINE,0,(LPARAM)(char*)(v)))
On 32-bit WinXP, this gives me g_tmp_str = "1" (of course, I typed '1' into the text field in the dialog). But on 64-bit Win7, it gives me g_tmp_str = "1" followed by a garbage character (the odd character can't be shown on Stack Overflow; the bytes are [0]=49 '1', [1]=16).
Generally speaking, a 32-bit exe works flawlessly on 64-bit Win7, so why does my program fail? Thanks.
Edit 1: IsWindowUnicode(m_hwnd) returns FALSE.
See my last comments on the topic.
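For context, here is the documented EM_GETLINE contract written out without the macro. This is only a sketch (whether the Win7 behavior comes from violating the contract is exactly what the question is about): the first word of the buffer must hold the buffer size, and the returned text is not null-terminated.

#include <windows.h>

int GetEditLine(HWND hDlg, int id, char *buf, WORD bufSize)
{
    // EM_GETLINE requires the FIRST WORD of the buffer to hold its size in
    // characters; reserve one character for the terminator we add ourselves,
    // because the copied line is NOT null-terminated.
    *(WORD *)buf = (WORD)(bufSize - 1);
    int len = (int)SendMessageA(GetDlgItem(hDlg, id), EM_GETLINE,
                                0 /* zero-based line index */, (LPARAM)buf);
    buf[len] = '\0';
    return len;
}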
I'm working with an older version of OpenSSL, and I'm running into some behavior that has stumped me for days when trying to work with cross-platform code.
I have code that calls OpenSSL to sign something. My code is modeled after the code in ASN1_sign, which is found in a_sign.c in OpenSSL and exhibits the same issues when I use it. Here is the relevant line of code (which is found, and used exactly the same way, in a_sign.c):
EVP_SignUpdate(&ctx,(unsigned char *)buf_in,inl);
ctx is a structure that OpenSSL uses, not relevant to this discussion
buf_in is a char* of the data that is to be signed
inl is the length of buf_in
EVP_SignUpdate can be called repeatedly in order to read in data to be signed before EVP_SignFinal is called to sign it.
Everything works fine when this code is used on Ubuntu and Windows 7, both of them produce the exact same signatures given the same inputs.
On OS X, if inl is 64 or less (that is, there are 64 bytes or fewer in buf_in), then it too produces the same signatures as Ubuntu and Windows. However, once inl becomes greater than 64, OS X produces its own internally consistent signatures that differ from the other platforms. By internally consistent, I mean that the Mac will read its own signatures and verify them as proper, while it will reject the signatures from Ubuntu and Windows, and vice versa.
I managed to fix this issue, and get the same signatures created on all platforms, by changing the line above to the following, which reads the buffer one byte at a time:
const unsigned char *p = (const unsigned char *)buf_in;
int input_it;
for (input_it = 0; input_it < inl; input_it++) {
    EVP_SignUpdate(&ctx, p + input_it, 1);  /* feed one byte at a time */
}
This causes OS X to reject its own previous signatures of data > 64 bytes as invalid, and I tracked down a similar line elsewhere, used for verifying signatures, that needed to be broken up in an identical manner.
This fixes signature creation and verification, but something is still going wrong: I'm encountering other problems, and I really don't want to go traipsing (and modifying!) much deeper into OpenSSL.
Surely I'm doing something wrong, as I see the exact same issues when I use stock ASN1_sign. Is this an issue with the way I compiled OpenSSL? For the life of me I can't figure it out. Can anyone educate me on what bone-headed mistake I must be making?
This is likely a bug in the MacOS implementation. I recommend you file a bug by sending the above text to the developers as described at http://www.openssl.org/support/faq.html#BUILD17
There are known issues with OpenSSL on the Mac (you have to jump through a few hoops to ensure it links with the correct library instead of the system library). Did you compile it yourself? The PROBLEMS file in the distribution explains the details of the issue and suggests a few workarounds. (Or, if you are running with shared libraries, double-check that your DYLD_LIBRARY_PATH is set correctly.) No guarantee, but this looks like a likely place to start...
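One quick way to rule the linking issue in or out is to compare the header version against the library the binary actually loads at run time. A minimal sketch using the classic (pre-1.1.0) OpenSSL API, to match the older version in question:

#include <stdio.h>
#include <openssl/opensslv.h>
#include <openssl/crypto.h>

int main(void)
{
    // If these two disagree, the binary linked against a different OpenSSL
    // than it was compiled against (e.g. the system library on OS X).
    printf("Headers : %s\n", OPENSSL_VERSION_TEXT);
    printf("Library : %s\n", SSLeay_version(SSLEAY_VERSION));
    return 0;
}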
The most common issue when porting code between Windows and Linux is the default value of uninitialized memory. I think Windows sets it to 0xDEADBEEF and Linux sets it to 0s.
Operating System: Windows XP 64 bit, SP2.
I have an unusual problem. I am porting some code from 32-bit to 64-bit. The 32-bit code works just fine, but when I call CreateThread() in the 64-bit version, the call fails. This happens in three places: two call CreateThread(), and one calls _beginthreadex(), which calls CreateThread().
All three calls fail with error code 0x3E6, "Invalid access to memory location".
The problem is that all the input parameters are correct.
HANDLE h;
DWORD threadID;

h = CreateThread(0,            // default security
                 0,            // default stack size
                 myThreadFunc, // valid function to call
                 myParam,      // my param
                 0,            // no flags, start thread immediately
                 &threadID);
All three calls to CreateThread() are made from a DLL I've injected into the target program at the start of its execution (this is before the program has reached the start of main()/WinMain()). If I call CreateThread() from the target program itself (same params), say via a menu, it works. Same parameters, etc. Bizarre.
If I pass NULL instead of &threadID, it still fails.
If I pass NULL as myParam, it still fails.
I'm not calling CreateThread from inside DllMain(), so that isn't the problem. I'm confused, and searching on Google etc. hasn't shown any relevant answers.
If anyone has seen this before or has any ideas, please let me know.
Thanks for reading.
ANSWER
Short answer: Stack Frames on x64 need to be 16 byte aligned.
Longer answer:
After much banging my head against the debugger wall, and posting responses to the various suggestions (all of which helped in some way, prodding me to try new directions), I started exploring what-ifs about what was on the stack prior to calling CreateThread(). This proved to be a red herring, but it did lead to the solution.
Adding extra data to the stack changes the stack frame alignment. Sooner or later one of the tests gets you to 16-byte stack frame alignment, and at that point the code worked. So I retraced my steps and started putting NULL data onto the stack rather than what I thought were the correct values (I had been pushing return addresses to fake up a call frame). It still worked, so the data isn't important; it must be the actual stack addresses.
I quickly realised it was 16-byte alignment for the stack. Previously I was only aware of 8-byte alignment for data. This Microsoft document explains all the alignment requirements.
If the stack frame is not 16-byte aligned on x64, the compiler may put large (8-byte or larger) data on the wrong alignment boundaries when it pushes data onto the stack.
Hence the problem I faced: the hooking code was called with a stack that was not aligned on a 16-byte boundary.
Quick summary of alignment requirements, expressed as size : alignment
1 : 1
2 : 2
4 : 4
8 : 8
10 : 16
16 : 16
Anything larger than 8 bytes is aligned on the next power of 2 boundary.
I think Microsoft's error code is a bit misleading. The initial STATUS_DATATYPE_MISALIGNMENT could have been expressed as a STATUS_STACK_MISALIGNMENT, which would be more helpful. But then turning STATUS_DATATYPE_MISALIGNMENT into ERROR_NOACCESS actually disguises and misleads as to what the problem is. Very unhelpful.
Thank you to everyone who posted suggestions. Even where I disagreed with them, they prompted me to test in a wide variety of directions.
I've written a more detailed description of the datatype misalignment problem here: 64 bit porting gotcha #1! x64 Datatype misalignment.
The only reason 64-bit would make a difference is that threading on 64-bit requires 64-bit aligned values. If threadID isn't 64-bit aligned, you could cause this problem.
OK, that idea's not it. Are you sure it's valid to call CreateThread before main()/WinMain()? It would explain why it works from a menu, because that's after main()/WinMain().
In addition, I'd triple-check the lifetime of myParam. CreateThread returns (this I know from experience) long before the function you pass in is called.
Post the thread routine's code (or just a few lines).
It suddenly occurs to me: are you sure you're injecting your 64-bit code into a 64-bit process? Because if you had a 64-bit CreateThread call and tried to inject that into a 32-bit process running under WOW64, bad things could happen.
Starting to seriously run out of ideas. Does the compiler report any warnings?
Could the bug be due to a bug in the host program rather than the DLL? There's some other code, such as loading a DLL if you used __declspec(import/export), that runs before main()/WinMain(). If that DllMain, for example, had a bug in it...
I ran into this issue today, and I checked every argument fed into _beginthread/CreateThread/NtCreateThread via rohitab's Windows API Monitor v2. Every argument was aligned properly (AFAIK).
So where does STATUS_DATATYPE_MISALIGNMENT come from?
The first few lines of NtCreateThread validate the parameters passed from user mode:
ProbeForReadSmallStructure (ThreadContext, sizeof (CONTEXT), CONTEXT_ALIGN);
For i386:
#define CONTEXT_ALIGN (sizeof(ULONG))
For amd64:
#define STACK_ALIGN (16UI64)
...
#define CONTEXT_ALIGN STACK_ALIGN
On amd64, if the ThreadContext pointer is not aligned to 16 bytes, NtCreateThread returns STATUS_DATATYPE_MISALIGNMENT.
CreateThread (actually CreateRemoteThread) allocates ThreadContext on the stack and does nothing special to guarantee that the alignment requirement is satisfied. Things will work smoothly if every piece of your code follows the Microsoft x64 calling convention, which unfortunately was not true for me.
PS: The same code may work on newer Windows (say, Vista and newer); I didn't check, though. I'm facing this issue on Windows Server 2003 R2 x64.
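To illustrate the probe described above: ProbeForReadSmallStructure is kernel-mode code, but the alignment test it applies can be mimicked in user mode. A sketch, using the amd64 CONTEXT_ALIGN value of 16 quoted above (the SDK declares CONTEXT with DECLSPEC_ALIGN(16) on x64, which is why compiler-placed locals normally pass):

#include <windows.h>
#include <stdint.h>
#include <stdio.h>

// Mimics the alignment check NtCreateThread applies to the ThreadContext
// pointer on amd64 (CONTEXT_ALIGN == 16).
static BOOL PassesContextProbe(const void *threadContext)
{
    return ((uintptr_t)threadContext % 16) == 0;
}

int main(void)
{
    CONTEXT ctx;  // DECLSPEC_ALIGN(16) on x64, so this passes
    __declspec(align(16)) char raw[sizeof(CONTEXT) + 16];
    printf("aligned local : %d\n", PassesContextProbe(&ctx));     // 1
    printf("offset pointer: %d\n", PassesContextProbe(raw + 1));  // 0
    return 0;
}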
I'm in the business of using parallel threads under Windows for calculations. No funny business, no DLL calls, and certainly no callbacks. The following works in 32-bit Windows: I set up the stack for my calculation, well within the area reserved for my program. All relevant data about the areas and start addresses is contained in a data structure that is passed to CreateThread as parameter 3. The address that is called contains a small assembler routine that uses this data structure. Indeed, this routine finds the address to return to on the stack, then the address of the data structure. There is no reason to go far into this: it just works, and it calculates the number of primes below 2,000,000,000 just fine, in one thread, in two threads, or in 20 threads.
Now, CreateThread in 64 bits doesn't push the address of the data structure. That seems implausible, so I'll show you the smoking gun: a dump of a debug session. In the subwindow at the bottom right you can see the stack, and there is merely the return address, amidst a sea of zeroes.
The mechanism I use to fill in parameters is portable between 32 and 64 bits. No other call exhibits a difference between word sizes. Moreover, why would the code address work but not the data address?
The bottom line: one would expect CreateThread to pass the data parameter on the stack the same way in 64 bits as in 32 bits, then do a subroutine call. At the assembler level, it doesn't work that way. If there are any hidden requirements on e.g. RSP that are automatically fulfilled in C++, that would be very nasty.
P.S. No, there are no 16-byte alignment problems. That lies ages behind me.
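A note on why the parameter isn't on the stack: under the Microsoft x64 calling convention, the first integer argument is passed in RCX, so a 64-bit thread routine receives its CreateThread parameter in RCX rather than at [rsp+4] the way a 32-bit stdcall routine would. A minimal C++ illustration (an assembler routine would read RCX at entry instead):

#include <windows.h>
#include <stdio.h>

struct WorkItem { int first; int count; };

// On x64, 'param' arrives in RCX (Microsoft x64 calling convention);
// on x86, the same parameter is found on the stack.
static DWORD WINAPI ThreadEntry(LPVOID param)
{
    WorkItem *w = (WorkItem *)param;
    printf("first=%d count=%d\n", w->first, w->count);
    return 0;
}

int main(void)
{
    WorkItem w = { 2, 1000000000 };
    HANDLE h = CreateThread(NULL, 0, ThreadEntry, &w, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}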
Try using _beginthread() or _beginthreadex() instead; you shouldn't be using CreateThread directly.
See this previous question.