ReadFile doesn't work asynchronously on Win7 and Win2k8 - winapi

According to MSDN, ReadFile can read data in two different ways: synchronously and asynchronously.
I need the second one. The following code demonstrates usage with an OVERLAPPED struct:
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
void Read()
{
    HANDLE hFile = CreateFileA("c:\\1.avi", GENERIC_READ, 0, NULL, OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    if ( hFile == INVALID_HANDLE_VALUE )
    {
        printf("Failed to open the file\n");
        return;
    }
    int dataSize = 256 * 1024 * 1024;
    char* data = (char*)malloc(dataSize);
    if ( data == NULL )
    {
        CloseHandle(hFile);
        return;
    }
    memset(data, 0xFF, dataSize);
    OVERLAPPED overlapped;
    memset(&overlapped, 0, sizeof(overlapped));
    printf("reading: %ld\n", (long)time(NULL));
    BOOL result = ReadFile(hFile, data, dataSize, NULL, &overlapped);
    printf("sent: %ld\n", (long)time(NULL));
    DWORD bytesRead;
    result = GetOverlappedResult(hFile, &overlapped, &bytesRead, TRUE); // wait until completion - returns immediately
    printf("done: %ld\n", (long)time(NULL));
    CloseHandle(hFile);
    free(data);
}
int main()
{
    Read();
    return 0;
}
On Windows XP output is:
reading: 1296651896
sent: 1296651896
done: 1296651899
This means that ReadFile didn't block and returned immediately, within the same second, whereas the read itself continued for 3 seconds. This is normal asynchronous reading.
But on Windows 7 and Windows Server 2008 I get the following results:
reading: 1296661205
sent: 1296661209
done: 1296661209
This is the behavior of a synchronous read.
MSDN says that an asynchronous ReadFile can sometimes behave synchronously (when the file is compressed or encrypted, for example). But in that situation the return value should be TRUE and GetLastError() == NO_ERROR.
On Windows 7 I get FALSE and GetLastError() == ERROR_IO_PENDING. So WinAPI tells me that the call is asynchronous, but the test shows that it is not!
I'm not the only one who found this "bug": read the comment on the ReadFile MSDN page.
So what's the solution? Does anybody know? It has been 14 months since Denis found this strange behavior.

I don't know the size of the "c:\1.avi" file, but the buffer you give to Windows (256 MB!) is probably big enough to hold the whole file. So Windows decides to read the whole file and put it in the buffer the way it likes. You don't say to Windows "I want async", you say "I know how to handle async".
Just change the buffer size to, say, 1024, and your program will behave exactly the same, but read only 1024 bytes (and return ERROR_IO_PENDING as well).
In general, you do asynchronous I/O because you want to do something else during the operation. Look at the sample here: Testing for the End of a File, as it demonstrates an asynchronous ReadFile. If you change the sample's buffer and set it to a big value, it should behave exactly like yours.
PS: I suggest you don't rely on time samples to check things; use return codes and events.

According to this, I would suspect that it should return TRUE in your case. But it may also be that the completion modes default settings are different on Win7/Win2k8.
Try setting a different mode with SetFileCompletionNotificationModes().

Have you tried using an event, as @Simon Mourier suggested? I know the documentation says that the event is not required, but the example in the links provided by @Simon Mourier uses an event for the asynchronous read.

Windows 7/Server 2008 behave differently in order to resolve a race condition that can occur in GetOverlappedResultEx. When you compile for these OSes, Windows detects this and uses the different behavior. I find this wicked confusing.
Here is a link:
http://msdn.microsoft.com/en-us/library/dd371711(VS.85).aspx
I'm sure you've read this many times in the past, but some of the text has changed since Win7 - esp the hEvent field in the OVERLAPPED struct,
http://msdn.microsoft.com/en-us/library/ms684342(v=VS.85).aspx
Functions such as GetOverlappedResult and the synchronization wait functions reset auto-reset events to the nonsignaled state. Therefore, you should use a manual reset event; if you use an auto-reset event, your application can stop responding if you wait for the operation to complete and then call GetOverlappedResult with the bWait parameter set to TRUE.
Could you do an experiment: please allocate a manual-reset event in your OVERLAPPED struct instead of an auto-reset event. (I don't see the allocation in your snippet; don't forget to create the event and to set 'hEvent' after zeroing the struct.)

This probably has something to do with caching. Try opening the file non-cached (FILE_FLAG_NO_BUFFERING).
EDIT
This is actually documented in the MSDN documentation for ReadFile:
Note: If a file or device is opened for asynchronous I/O, subsequent calls to functions such as ReadFile using that handle generally return immediately, but can also behave synchronously with respect to blocked execution. For more information see http://support.microsoft.com/kb/156932.

Related

Is there a race between starting and seeing yourself in WinApi's EnumProcesses()?

I just found this code in the wild:
def _scan_for_self(self):
    win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
    basename = self.cmdline.split()[0]
    pids = win32process.EnumProcesses()
    if not pids:
        UserLog.warn("WindowsProcess", "no pids", pids)
    for pid in pids:
        try:
            handle = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION | win32con.PROCESS_VM_READ,
                pywintypes.FALSE, pid)
        except pywintypes.error, err:
            UserLog.warn("WindowsProcess", str(err))
            continue
        try:
            modlist = win32process.EnumProcessModules(handle)
        except pywintypes.error,err:
            UserLog.warn("WindowsProcess",str(err))
            continue
This line caught my eye:
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
It suggests that if you call EnumProcesses() too fast after starting, you won't see yourself. Is there any truth to this?
There is a race, but it's not the race the code tried to protect against.
A successful call to CreateProcess returns only after the kernel object representing the process has been created and enqueued into the kernel's process list. A subsequent call to EnumProcesses accesses the same list, and will immediately observe the newly created process object.
That is, unless the process object has since been destroyed. This isn't entirely unusual since processes in Windows are initialized in-process. The documentation even makes note of that:
Note that the function returns before the process has finished initialization. If a required DLL cannot be located or fails to initialize, the process is terminated.
What this means is that if a call to EnumProcesses immediately following a successful call to CreateProcess doesn't observe the newly created process, it does so because it was late rather than early. If you are late already then adding a delay will only make you more late.
Which swiftly leads to the actual race here: Process IDs uniquely identify processes only for a finite time interval. Once a process object is gone, its ID is up for grabs, and the system will reuse it at some point. The only reliable way to identify a process is by holding a handle to it.
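To illustrate the point about handles, here is a minimal sketch, assuming the caller is the one that starts the process. RunAndWait is a hypothetical helper name, not something from the code in the question. It never looks the child up by PID; it simply keeps the handle that CreateProcess hands back.

```cpp
// Windows-only sketch; the guard just lets the file compile elsewhere.
#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: start a child, wait for it, and report its exit code.
// The process handle keeps referring to this one process, even after it exits.
bool RunAndWait(char* cmdline, DWORD* exitCode)
{
    STARTUPINFOA si = {};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {};
    if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return false;
    CloseHandle(pi.hThread);                     // the thread handle is not needed here
    WaitForSingleObject(pi.hProcess, INFINITE);  // no PID lookup, so no PID-reuse race
    BOOL ok = GetExitCodeProcess(pi.hProcess, exitCode);
    CloseHandle(pi.hProcess);                    // after this, the PID may be reused
    return ok != FALSE;
}
#endif
```

Usage would be along the lines of `char cmd[] = "cmd.exe /c exit 7"; DWORD code; RunAndWait(cmd, &code);`.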
Now it's anyone's guess what the author of _scan_for_self was trying to accomplish. As written, the code takes more time to do something that's probably altogether wrong [1] anyway.
[1] Turns out my gut feeling was correct. This is just your average POSIX developer who, in the process of learning that POSIX is insufficient, would rather call out Microsoft instead of actually using an all-around superior API.
The documentation for EnumProcesses (Win32 API - EnumProcesses function) does not mention any delay needed before the current process shows up in the list it returns.
The example from Microsoft showing how to use EnumProcesses to enumerate all running processes (Enumerating All Processes) also does not contain any delay before calling EnumProcesses.
A small test application I created in C++ (see below) always reports that the current process is in the list (tested on Windows 10):
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
#include <vector>
const DWORD MAX_NUM_PROCESSES = 4096;
DWORD aProcesses[MAX_NUM_PROCESSES];
int main(void)
{
    // Get the list of running process Ids:
    DWORD cbNeeded;
    if (!EnumProcesses(aProcesses, MAX_NUM_PROCESSES * sizeof(DWORD), &cbNeeded))
    {
        return 1;
    }
    // Check if current process is in the list:
    DWORD curProcId = GetCurrentProcessId();
    bool bFoundCurProcId{ false };
    DWORD numProcesses = cbNeeded / sizeof(DWORD);
    for (DWORD i=0; i<numProcesses; ++i)
    {
        if (aProcesses[i] == curProcId)
        {
            bFoundCurProcId = true;
        }
    }
    std::cout << "bFoundCurProcId: " << bFoundCurProcId << std::endl;
    return 0;
}
Note: I am aware that the fact that the program reported the expected result does not mean that there is no race. Maybe I just couldn't catch it manifesting. But running code like this can sometimes give you a hint (especially if the result had been that there is a race).
The fact that I never had a problem running this test (I did it many times), together with the lack of any mention of a required delay in Microsoft's documentation, makes me believe that it is not required.
My conclusion is that either:
There is a unique issue when using it from python (doubt it).
or:
The code you found is doing something unnecessary.
There is no race.
EnumProcesses calls a NT API function that switches to kernel mode to walk the linked list of processes. Your own process has been added to the list before it starts running.

Reading pipe asynchronously using ReadFile

I think I need some clarification on how to read from a named pipe and have it return immediately, data or not. What I am seeing is ReadFile fails, as expected, but GetLastError returns either ERROR_IO_PENDING or ERROR_PIPE_NOT_CONNECTED, and it does this until my surrounding code times out. I get these errors EVEN THOUGH THE DATA HAS IN FACT ARRIVED. I know this by checking my read buffer and seeing what I expect. And the pipe keeps working. I suspect I am not using the overlapped structure correctly, I'm just setting all fields to zero. My code looks like this:
gPipe = CreateFile(gPipename, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
pMode = PIPE_READMODE_MESSAGE;
bret = SetNamedPipeHandleState(gPipe, &pMode, NULL, NULL);
OVERLAPPED ol;
memset(&ol, 0, sizeof(OVERLAPPED));
// the following inside a loop that times out after a period
bret = ReadFile(gPipe, &tmostat, sizeof(TMO64STAT), NULL, &ol);
if (bret) break;
err = GetLastError();
// seeing err == ERROR_IO_PENDING or ERROR_PIPE_NOT_CONNECTED
So I can do what I want by ignoring the errors and checking for arrived data, but it troubles me. Any idea why I am getting this behavior?
Windows OVERLAPPED I/O doesn't work like the non-blocking flag on other OSes. (On Linux, for example, the closest equivalent is the aio_*() API, not FIONBIO.)
With OVERLAPPED I/O, the operation hasn't failed, it proceeds in the background. But you are never checking on it... you just try again. There's a queue of pending operations, you're always starting new ones, never checking on the old ones.
Fill in the hEvent field in the OVERLAPPED structure, and use it to detect when the operation completes. Then call GetOverlappedResult() to get the number of bytes actually transferred.
Another important note: the OS owns the OVERLAPPED structure and the buffer until the operation completes. You must make sure these stay valid, and must not deallocate them or use them for any other operation, until you confirm that the first operation has completed.
Note that there is an actual non-blocking mode for Win32 pipes, but Microsoft strongly recommends against using it:
The nonblocking-wait mode is supported for compatibility with Microsoft LAN Manager version 2.0. This mode should not be used to achieve overlapped input and output (I/O) with named pipes. Overlapped I/O should be used instead, because it enables time-consuming operations to run in the background after the function returns.
Named Pipe Type, Read, and Wait Modes
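The advice above (issue one read, wait on the event, then collect the result) can be sketched roughly as follows. This is only an illustration under stated assumptions: ReadWithTimeout is a hypothetical helper name, the handle is assumed to have been opened with FILE_FLAG_OVERLAPPED, and the read is assumed to start at offset 0 (which is all a pipe needs).

```cpp
// Windows-only sketch; the guard just lets the file compile elsewhere.
#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: issue ONE overlapped read and wait up to timeoutMs for it.
// Returns ERROR_SUCCESS, WAIT_TIMEOUT, or another Win32 error code.
DWORD ReadWithTimeout(HANDLE h, void* buf, DWORD len, DWORD timeoutMs, DWORD* got)
{
    *got = 0;
    OVERLAPPED ol = {};
    ol.hEvent = CreateEvent(NULL, TRUE /* manual reset */, FALSE, NULL);
    if (ol.hEvent == NULL)
        return GetLastError();

    DWORD err = ERROR_SUCCESS;
    if (!ReadFile(h, buf, len, NULL, &ol)) {
        err = GetLastError();
        if (err == ERROR_IO_PENDING) {
            // The read is in flight: wait for it instead of starting another one.
            if (WaitForSingleObject(ol.hEvent, timeoutMs) == WAIT_OBJECT_0) {
                err = ERROR_SUCCESS;
            } else {
                CancelIo(h);          // give up on the outstanding request
                err = WAIT_TIMEOUT;
            }
        }
    }
    if (err == ERROR_SUCCESS || err == WAIT_TIMEOUT) {
        // bWait = TRUE: buf and ol belong to the OS until the request has really
        // finished, and that includes the cancelled case.
        if (GetOverlappedResult(h, &ol, got, TRUE))
            err = ERROR_SUCCESS;      // completed, possibly just before the cancel
        else if (err != WAIT_TIMEOUT)
            err = GetLastError();
    }
    CloseHandle(ol.hEvent);
    return err;
}
#endif
```

The surrounding loop would then call this once per message and treat WAIT_TIMEOUT as "nothing arrived yet" rather than re-issuing ReadFile on every pass.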

Serial I/O Overlapped/Non-Overlapped with Windows/Windows CE

I'm sorry this isn't much of a question, but more of to help people having problems with these particular things. The problem I'm working on requires the use of Serial I/O, but is primarily running under Windows CE 6.0. However, I was recently asked if the application could also be made to work under Windows too, so I set about solving this problem. I did spend quite a lot of time looking around to see if anyone had the answers I was looking for and it all came across as a lot of misinformation and things that were just basically wrong in some instances. So having solved this problem, I thought I'd share my findings with everyone so anyone encountering these difficulties would have answers.
Under Windows CE, OVERLAPPED I/O is NOT supported. This means that bi-directional communication through the serial port can be quite troublesome. The main problem is that while you are waiting on data from the serial port, you cannot send data, because doing so will cause your main thread to block until the read operation completes or times out (depending on whether you've set timeouts up).
Like most people doing serial I/O, I had a reader serial thread set up for reading the serial port, which used WaitCommEvent() with an EV_RXCHAR mask to wait for serial data. Now this is where the difficulty arises with Windows and Windows CE.
If I have a simple reader thread like this, as an example:-
UINT SimpleReaderThread(LPVOID thParam)
{
    DWORD eMask;
    WaitCommEvent(thParam, &eMask, NULL);
    MessageBox(NULL, TEXT("Thread Exited"), TEXT("Hello"), MB_OK);
    return 0;
}
Obviously in the above example, I'm not reading the data from the serial port or anything and I'm assuming that thParam contains the opened handle to the comm port etc. Now, the problem is under Windows when your thread executes and hits the WaitCommEvent(), your reader thread will go to sleep waiting for serial port data. Okay, that's fine and as it should be, but... how do you end this thread and get the MessageBox() to appear? Well, as it turns out, it's not actually that easy and is a fundamental difference between Windows CE and Windows in the way it does its Serial I/O.
Under Windows CE, you can do a couple of things to make the WaitCommEvent() fall through, such as SetCommMask(COMMPORT_HANDLE, 0) or even CloseHandle(COMMPORT_HANDLE). This will allow you to properly terminate your thread and therefore release the serial port for you to start sending data again. However neither of these things will work under Windows and both will cause the thread you call them from to sleep waiting on the completion of the WaitCommEvent(). So, how do you end the WaitCommEvent() under Windows? Well, ordinarily you'd use OVERLAPPED I/O and the thread blocking wouldn't be an issue, but since the solution has to be compatible with Windows CE as well, OVERLAPPED I/O isn't an option. There is one thing you can do under Windows to end the WaitCommEvent() and that is to call the CancelSynchronousIo() function and this will end your WaitCommEvent(), but be aware this can be device dependent. The main problem with CancelSynchronousIo() is that it isn't supported by Windows CE either, so you're out of luck using that for this problem!
So how do you do it? The fact is, to solve this problem, you simply can't use WaitCommEvent() as there is no way to terminate this function on Windows that is supported by Windows CE. That then leaves you with ReadFile() which again will block whilst it is reading NON OVERLAPPED I/O and this WILL work with Comm Timeouts.
Using ReadFile() and a COMMTIMEOUTS structure does mean that you will have to have a tight loop waiting for your serial data, but if you're not receiving large amounts of serial data, it shouldn't be a problem. Waiting on an event with a small timeout to end your loop also ensures that resources are passed back to the system and you're not hammering the processor at 100% load. Below is the solution I came up with; I would appreciate some feedback if you think it could be improved.
typedef struct
{
    UINT8 sync;
    UINT8 op;
    UINT8 dev;
    UINT8 node;
    UINT8 data;
    UINT8 csum;
} COMMDAT;
COMSTAT cs = {0};
DWORD byte_count;
COMMDAT cd;
ZeroMemory(&cd, sizeof(COMMDAT));
bool recv = false;
do
{
    // refresh cs.cbInQue (number of bytes waiting in the receive queue)
    ClearCommError(comm_handle, 0, &cs);
    if (cs.cbInQue == sizeof(COMMDAT))
    {
        ReadFile(comm_handle, &cd, sizeof(COMMDAT), &byte_count, NULL);
        recv = true;
    }
} while ((WaitForSingleObject(event_handle, 2) != WAIT_OBJECT_0) && !recv);
ExitThread(recv ? cd.data : 0xFF);
So to end the thread you just signal the event in event_handle, which allows you to exit the thread properly and clean up resources. This works correctly on both Windows and Windows CE.
Hope that helps everyone I've seen having difficulty with this problem.
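The post describes relying on comm timeouts but doesn't show how they are set up, so here is a minimal sketch of that part. SetReadTimeout and ReadUntilStopped are hypothetical helper names, and the choice of a constant total timeout with no per-byte component is an assumption, not necessarily what the author used.

```cpp
// Windows-only sketch; the guard just lets the file compile elsewhere.
#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: make a blocking ReadFile on the port give up after
// roughly totalMs, so a reader loop can check a stop event between reads.
bool SetReadTimeout(HANDLE comm, DWORD totalMs)
{
    COMMTIMEOUTS to = {};
    to.ReadIntervalTimeout = 0;         // no inter-byte timeout
    to.ReadTotalTimeoutMultiplier = 0;  // no per-byte component
    to.ReadTotalTimeoutConstant = totalMs;
    return SetCommTimeouts(comm, &to) != FALSE;
}

// Hypothetical helper: read until buf is full or stopEvent is signaled.
// Returns the number of bytes stored in buf.
DWORD ReadUntilStopped(HANDLE comm, HANDLE stopEvent, BYTE* buf, DWORD len)
{
    DWORD total = 0;
    while (total < len && WaitForSingleObject(stopEvent, 0) != WAIT_OBJECT_0) {
        DWORD got = 0;
        if (!ReadFile(comm, buf + total, len - total, &got, NULL))
            break;                      // a real error, not a timeout
        total += got;                   // got == 0 just means the timeout elapsed
    }
    return total;
}
#endif
```

With this arrangement the read itself provides the short sleep, so the loop doesn't spin, and partial frames are accumulated instead of waiting for the queue to hold exactly one frame.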
Since I think there was a misunderstanding in my comment above, here's more detail on two possible solutions that don't use a tight loop. Note that these use runtime determination and are therefore fine under both OSes (though you have to compile for each target separately anyway), and since neither uses an #ifdef it's less likely to end up breaking the compiler on one side or the other without you noticing immediately.
First, you could dynamically load CancelSynchronousIo and use it when present in the OS, optionally even doing something else instead of the cancel for CE (like maybe closing the handle?):
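typedef BOOL (WINAPI *CancelSyncIoFn)(HANDLE hThread); // renamed: CancelIo is already a Win32 function
HANDLE hPort;
HANDLE hIoThread; // the thread that is blocked in the synchronous call
BOOL WINAPI CancelStub(HANDLE h)
{
    // stub for WinCE: ignore the thread handle and close the port instead
    return CloseHandle(hPort);
}
void IoWithCancel()
{
    CancelSyncIoFn cancelFcn;
    // note: desktop GetProcAddress takes a narrow (ANSI) name, so _T() here needs a non-UNICODE build
    cancelFcn = (CancelSyncIoFn)GetProcAddress(
        GetModuleHandle(_T("kernel32.dll")),
        _T("CancelSynchronousIo"));
    // if for some reason you want something to happen in CE
    if(cancelFcn == NULL)
    {
        cancelFcn = CancelStub;
    }
    hPort = CreateFile( /* blah, blah */);
    // do my I/O (on hIoThread)
    if(cancelFcn != NULL)
    {
        cancelFcn(hIoThread); // CancelSynchronousIo takes a thread handle, not the port handle
    }
}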
The other option, which takes a bit more work as you're going to likely have different threading models (though if you're using C++, it would be an excellent case for separate classes based on platform anyway) would be to determine the platform and use overlapped on the desktop:
HANDLE hPort;
void IoWithOverlapped()
{
    DWORD overlapped = 0;
    OSVERSIONINFO version;
    // the size field must be set before the call, not after
    version.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
    GetVersionEx(&version);
    if((version.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS)
        || (version.dwPlatformId == VER_PLATFORM_WIN32_NT))
    {
        overlapped = FILE_FLAG_OVERLAPPED;
    }
    else
    {
        // create a receive thread
    }
    hPort = CreateFile(
        _T("COM1:"),
        GENERIC_READ | GENERIC_WRITE,
        FILE_SHARE_READ | FILE_SHARE_WRITE,
        NULL,
        OPEN_EXISTING,
        overlapped,
        NULL);
}

Change default Console I/O functions handle

Is it possible to somehow change the handle used by the standard I/O functions on Windows? The preferred language is C++. If I understand it right, by selecting a console project the compiler just pre-allocates a console for you and makes all standard I/O functions work with its handle. So, what I want to do is let one console app actually write into another app's console buffer. I thought that I could get the first app's console handle, then pass it to the second app through a file (I don't know much about interprocess communication, and this seems easy) and then somehow use, for example, printf with the first app's handle. Can this be done? I know how to get the console handle, but I have no idea how to redirect printf to that handle. It's just a study project to better understand how the OS works behind the scenes. I am interested in how printf knows which console it is associated with.
If I understand you correctly, it sounds like you want the Windows API function AttachConsole(pid), which attaches the current process to the console owned by the process whose PID is pid.
If I understand you correctly, you can find source code for the kind of application you want to write at http://msdn.microsoft.com/en-us/library/ms682499%28VS.85%29.aspx. This example shows how to write to the stdin of another application and read its stdout.
For general understanding: the compiler doesn't "pre-allocate a console for you". The compiler uses the standard C/C++ libraries, which write to the output. So if you use, for example, printf(), the code that is executed in the end will look like this:
void Output (PCWSTR pszwText, UINT uTextLength) // uTextLength is the length in characters
{
    DWORD n;
    UINT uCodePage = GetOEMCP(); // CP_OEMCP, CP_THREAD_ACP, CP_ACP
    PSTR pszText = (PSTR)_alloca (uTextLength);
    // the console typically does not use UNICODE, so convert first
    if (WideCharToMultiByte (uCodePage, 0, pszwText, uTextLength,
                             pszText, uTextLength, NULL, NULL) != (int)uTextLength)
        return;
    WriteFile (GetStdHandle (STD_OUTPUT_HANDLE), pszText, uTextLength, &n, NULL);
    //_tprintf (TEXT("%.*ls"), uTextLength, pszText);
    //_puttchar();
    //fwrite (pszText, sizeof(TCHAR), uTextLength, stdout);
    //_write (
}
So if one changes the value of STD_OUTPUT_HANDLE, all output will go to a file, a pipe and so on. If the program used the WriteConsole function instead of WriteFile, such redirection would not work, but the standard C/C++ library doesn't do this.
If you want to redirect stdout not from a child process but from the current process, you can call SetStdHandle() directly (see http://msdn.microsoft.com/en-us/library/ms686244%28VS.85%29.aspx).
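As a rough illustration of the SetStdHandle() route, here is a minimal sketch. WriteViaStdHandle is a hypothetical helper name, and the sketch writes with WriteFile through GetStdHandle on purpose: as far as I know the C runtime picks up its stdout handle at startup, so an already-initialized printf may not follow a later SetStdHandle call.

```cpp
// Windows-only sketch; the guard just lets the file compile elsewhere.
#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: point STD_OUTPUT_HANDLE at a file, write through it,
// then restore the previous handle.
bool WriteViaStdHandle(const char* path, const char* text, DWORD len)
{
    HANDLE file = CreateFileA(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                              FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE)
        return false;
    HANDLE old = GetStdHandle(STD_OUTPUT_HANDLE);
    SetStdHandle(STD_OUTPUT_HANDLE, file);
    DWORD n = 0;
    // Anything that looks the handle up from now on ends up in the file.
    BOOL ok = WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), text, len, &n, NULL);
    SetStdHandle(STD_OUTPUT_HANDLE, old);
    CloseHandle(file);
    return ok && n == len;
}
#endif
```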
The "allocating of the console" is done by the operating system's loader. It looks at a word in the binary EXE file (the Subsystem field of IMAGE_OPTIONAL_HEADER, see http://msdn.microsoft.com/en-us/library/ms680339%28VS.85%29.aspx) and if the EXE has 3 there (IMAGE_SUBSYSTEM_WINDOWS_CUI), then it uses the console of the parent process or creates a new one. One can change this behavior a little through the parameters of the CreateProcess call (but only if you start the child process from your own code). You set this Subsystem flag of the EXE with the linker switch /subsystem (see http://msdn.microsoft.com/en-us/library/fcc1zstk%28VS.80%29.aspx).
If you want to redirect printf to a handle (FILE*), just do
fprintf(handle, "...");
For example replicating printf with fprintf
fprintf(stdout, "...");
Or error reporting
fprintf(stderr, "FATAL: %s fails", "smurf");
This is also how you write to files. fprintf(file, "Blah.");
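Putting the fprintf lines above together, a minimal sketch where the destination is just a parameter (Report is a hypothetical helper name used only for illustration):

```cpp
#include <cstdio>

// Hypothetical helper: the same formatting code can target stdout, stderr,
// or a file opened with fopen, because each of them is a FILE*.
int Report(std::FILE* out, const char* what, int code)
{
    return std::fprintf(out, "%s failed with code %d\n", what, code);
}
```

Calling `Report(stderr, "open", 5)` reports to the error stream, while passing a FILE* obtained from fopen writes the same line to that file.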

GetIpAddrTable() leaks memory. How to resolve that?

On my Windows 7 box, this simple program causes the memory use of the application to creep up continuously, with no upper bound. I've stripped out everything non-essential, and it seems clear that the culprit is the Microsoft Iphlpapi function "GetIpAddrTable()". On each call, it leaks some memory. In a loop (e.g. checking for changes to the network interface list), it is unsustainable. There seems to be no async notification API which could do this job, so now I'm faced with possibly having to isolate this logic into a separate process and recycle the process periodically -- an ugly solution.
Any ideas?
// IphlpLeak.cpp - demonstrates that GetIpAddrTable leaks memory internally: run this and watch
// the memory use of the app climb up continuously with no upper bound.
#include <stdio.h>
#include <windows.h>
#include <assert.h>
#include <Iphlpapi.h>
#pragma comment(lib,"Iphlpapi.lib")
void testLeak() {
    static unsigned char buf[16384];
    DWORD dwSize(sizeof(buf));
    if (GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false) == ERROR_INSUFFICIENT_BUFFER)
    {
        assert(0); // we never hit this branch.
        return;
    }
}
int main(int argc, char* argv[]) {
    for ( int i = 0; true; i++ ) {
        testLeak();
        printf("i=%d\n",i);
        Sleep(1000);
    }
    return 0;
}
@Stabledog:
I ran your example, unmodified, for 24 hours but did not observe the program's Commit Size increasing indefinitely. It always stayed below 1024 kilobytes. This was on Windows 7 (32-bit, without Service Pack 1).
Just for the sake of completeness, what happens to memory usage if you comment out the entire if block and the sleep? If there's no leak there, then I would suggest you're correct as to what's causing it.
Worst case, report it to MS and see if they can fix it - you have a nice simple test case to work from which is more than what I see in most bug reports.
Another thing you may want to try is to check the error code against NO_ERROR rather than a specific error condition. If you get back a different error than ERROR_INSUFFICIENT_BUFFER, there may be a leak for that:
DWORD dwRetVal = GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false);
if (dwRetVal != NO_ERROR) {
    printf ("ERROR: %lu\n", dwRetVal);
}
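Along the same lines, a minimal sketch that reports every failure code and sizes the buffer from what the API asks for, rather than assuming 16 KB is enough. CountIpAddresses is a hypothetical helper name. This only makes the error handling explicit; it is not claimed to change the memory growth described in the question.

```cpp
// Windows-only sketch; the guard just lets the file compile elsewhere.
#ifdef _WIN32
#include <windows.h>
#include <iphlpapi.h>
#include <vector>
#pragma comment(lib, "iphlpapi.lib")

// Hypothetical helper: fetch the table, growing the buffer once if asked to.
// Returns the Win32 error code; on NO_ERROR, *count holds the number of entries.
DWORD CountIpAddresses(DWORD* count)
{
    *count = 0;
    std::vector<unsigned char> buf(sizeof(MIB_IPADDRTABLE));
    ULONG size = (ULONG)buf.size();
    DWORD rc = GetIpAddrTable((PMIB_IPADDRTABLE)buf.data(), &size, FALSE);
    if (rc == ERROR_INSUFFICIENT_BUFFER) {
        buf.resize(size);   // size now holds the required byte count
        rc = GetIpAddrTable((PMIB_IPADDRTABLE)buf.data(), &size, FALSE);
    }
    if (rc == NO_ERROR)
        *count = ((PMIB_IPADDRTABLE)buf.data())->dwNumEntries;
    return rc;              // any other value is a failure worth logging
}
#endif
```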
I've been all over this issue now: it appears that there is no acknowledgment from Microsoft on the matter, but even a trivial application grows without bounds on Windows 7 (not XP, though) when calling any of the APIs which retrieve the local IP addresses.
So the way I solved it -- for now -- was to launch a separate instance of my app with a special command-line switch that tells it "retrieve the IP addresses and print them to stdout". I scrape stdout in the parent app, the child exits and the leak problem is resolved.
But it wins "dang ugly solution to an annoying problem", at best.

Resources