How to make pthread_cond_timedwait() robust against system clock manipulations? - time

Consider the following source code, which is fully POSIX compliant:
#include <stdio.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>      /* gmtime, strftime */
#include <unistd.h>    /* sleep */
#include <pthread.h>
#include <sys/time.h>

int main (int argc, char ** argv) {
    pthread_cond_t c;
    pthread_mutex_t m;
    char printTime[UCHAR_MAX];

    pthread_mutex_init(&m, NULL);
    pthread_cond_init(&c, NULL);

    for (;;) {
        struct tm * tm;
        struct timeval tv;
        struct timespec ts;

        gettimeofday(&tv, NULL);
        printf("sleep (%ld)\n", (long)tv.tv_sec);
        sleep(3);

        tm = gmtime(&tv.tv_sec);
        strftime(printTime, UCHAR_MAX, "%Y-%m-%d %H:%M:%S", tm);
        printf("%s (%ld)\n", printTime, (long)tv.tv_sec);

        ts.tv_sec = tv.tv_sec + 5;
        ts.tv_nsec = tv.tv_usec * 1000;

        pthread_mutex_lock(&m);
        pthread_cond_timedwait(&c, &m, &ts);
        pthread_mutex_unlock(&m);
    }
    return 0;
}
This program prints the current system date every 5 seconds; however, it sleeps for 3 seconds between getting the current system time (gettimeofday) and the condition wait (pthread_cond_timedwait).
Right after it prints "sleep (...)", try setting the system clock two days into the past. What happens? Well, instead of waiting 2 more seconds on the condition as it usually does, pthread_cond_timedwait now waits for two days and 2 seconds.
How do I fix that?
How can I write POSIX compliant code, that does not break when the user manipulates the system clock?
Please keep in mind that the system clock might change even without user interaction (e.g. an NTP client might update the clock automatically once a day). Setting the clock into the future is no problem: it only causes the sleep to wake up early, which is usually harmless and easy to detect and handle. Setting the clock into the past (e.g. because it was running ahead and NTP corrected it), however, can cause a big problem.
PS:
Neither pthread_condattr_setclock() nor CLOCK_MONOTONIC exists on my system. Both are mandatory in the POSIX 2008 specification (part of "Base"), but most systems still only follow the POSIX 2004 specification as of today, where these two were optional (part of the Advanced Realtime Extension).

Interesting, I've not encountered that behaviour before but, then again, I'm not in the habit of mucking about with my system time that much :-)
Assuming you're doing that for a valid reason, one possible (though kludgy) solution is to have another thread whose sole purpose is to periodically kick the condition variable to wake up any threads so affected.
In other words, something like:
while (1) {
    sleep (10);
    pthread_cond_signal (&condVar);
}
Your code that's waiting for the condition variable to be kicked should be checking its predicate anyway (to take care of spurious wakeups) so this shouldn't have any real detrimental effect on the functionality.
It's a slight performance hit but once every ten seconds shouldn't be too much of a problem. It's only really meant to take care of the situations where (for whatever reason) your timed wait will be waiting a long time.
Another possibility is to re-engineer your application so that you don't need timed waits at all.
In situations where threads need to be woken for some reason, it's invariably by another thread which is perfectly capable of kicking a condition variable to wake one (or broadcasting to wake the lot of them).
This is very similar to the kicking thread I mentioned above but more as an integral part of your architecture than a bolt-on.

You can defend your code against this problem. One easy way is to have one thread whose sole purpose is to watch the system clock. You keep a global linked list of condition variables, and if the clock watcher thread sees a system clock jump, it broadcasts every condition variable on the list. Then, you simply wrap pthread_cond_init and pthread_cond_destroy with code that adds/removes the condition variable to/from the global linked list. Protect the linked list with a mutex.

Related

Is there a race between starting and seeing yourself in WinApi's EnumProcesses()?

I just found this code in the wild:
def _scan_for_self(self):
    win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
    basename = self.cmdline.split()[0]
    pids = win32process.EnumProcesses()
    if not pids:
        UserLog.warn("WindowsProcess", "no pids", pids)
    for pid in pids:
        try:
            handle = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION | win32con.PROCESS_VM_READ,
                pywintypes.FALSE, pid)
        except pywintypes.error, err:
            UserLog.warn("WindowsProcess", str(err))
            continue
        try:
            modlist = win32process.EnumProcessModules(handle)
        except pywintypes.error, err:
            UserLog.warn("WindowsProcess", str(err))
            continue
This line caught my eye:
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
It suggests that if you call EnumProcesses() too fast after starting, you won't see yourself. Is there any truth to this?
There is a race, but it's not the race the code tried to protect against.
A successful call to CreateProcess returns only after the kernel object representing the process has been created and enqueued into the kernel's process list. A subsequent call to EnumProcesses accesses the same list, and will immediately observe the newly created process object.
That is, unless the process object has since been destroyed. This isn't entirely unusual since processes in Windows are initialized in-process. The documentation even makes note of that:
Note that the function returns before the process has finished initialization. If a required DLL cannot be located or fails to initialize, the process is terminated.
What this means is that if a call to EnumProcesses immediately following a successful call to CreateProcess doesn't observe the newly created process, it does so because it was late rather than early. If you are late already then adding a delay will only make you more late.
Which swiftly leads to the actual race here: Process IDs uniquely identify processes only for a finite time interval. Once a process object is gone, its ID is up for grabs, and the system will reuse it at some point. The only reliable way to identify a process is by holding a handle to it.
Now it's anyone's guess what the author of _scan_for_self was trying to accomplish. As written, the code takes more time to do something that's probably altogether wrong [1] anyway.
[1] Turns out my gut feeling was correct. This is just your average POSIX developer who, in the process of learning that POSIX is insufficient, would rather call out Microsoft than actually use an all-around superior API.
The documentation for EnumProcesses (Win32 API - EnumProcesses function) does not mention anything about a delay needed to see the current process in the list it returns.
The example from Microsoft how to use EnumProcess to enumerate all running processes (Enumerating All Processes), also does not contain any delay before calling EnumProcesses.
A small test application I created in C++ (see below) always reports that the current process is in the list (tested on Windows 10):
#include <Windows.h>
#include <Psapi.h>
#include <iostream>

const DWORD MAX_NUM_PROCESSES = 4096;
DWORD aProcesses[MAX_NUM_PROCESSES];

int main(void)
{
    // Get the list of running process Ids:
    DWORD cbNeeded;
    if (!EnumProcesses(aProcesses, MAX_NUM_PROCESSES * sizeof(DWORD), &cbNeeded))
    {
        return 1;
    }

    // Check if the current process is in the list:
    DWORD curProcId = GetCurrentProcessId();
    bool bFoundCurProcId{ false };
    DWORD numProcesses = cbNeeded / sizeof(DWORD);
    for (DWORD i = 0; i < numProcesses; ++i)
    {
        if (aProcesses[i] == curProcId)
        {
            bFoundCurProcId = true;
        }
    }
    std::cout << "bFoundCurProcId: " << bFoundCurProcId << std::endl;
    return 0;
}
Note: I am aware that the fact that the program reported the expected result does not mean that there is no race. Maybe I just couldn't catch it manifest. But trying to run code like that can give you a hint sometimes (especially if the result would have been that there is a race).
The fact that I never had a problem running this test (did it many times), together with the lack of any mention of the need for a delay in Microsoft's documentation make me believe that it is not required.
My conclusion is that either:
- there is a unique issue when using it from Python (I doubt it), or
- the code you found is doing something unnecessary.
There is no race.
EnumProcesses calls an NT API function that switches to kernel mode to walk the linked list of processes. Your own process has been added to the list before it starts running.

Slow thread creation on Windows

I have upgraded a number crunching application to a multi-threaded program, using the C++11 facilities. It works well on Mac OS X but does not benefit from multithreading on Windows (Visual Studio 2013). Using the following toy program
#include <chrono>      // std::chrono timing
#include <functional>  // std::ref
#include <iostream>
#include <thread>

void t1(int& k) {
    k += 1;
}

void t2(int& k) {
    k += 1;
}

int main(int argc, const char *argv[])
{
    int a{ 0 };
    int b{ 0 };
    auto start_time = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 10000; ++i) {
        std::thread thread1{ t1, std::ref(a) };
        std::thread thread2{ t2, std::ref(b) };
        thread1.join();
        thread2.join();
    }
    auto end_time = std::chrono::high_resolution_clock::now();
    auto time_stack = std::chrono::duration_cast<std::chrono::microseconds>(
        end_time - start_time).count();
    std::cout << "Time: " << time_stack / 10000.0 << " microseconds" << std::endl;
    std::cout << a << " " << b << std::endl;
    return 0;
}
I have discovered that it takes 34 microseconds to start a thread on Mac OS X and 340 microseconds to do the same on Windows. Am I doing something wrong on the Windows side? Is it a compiler issue?
Not a compiler problem (nor an operating system problem, strictly speaking).
It is a well-known fact that creating threads is an expensive operation. This is especially true under Windows (used to be true under Linux prior to clone as well).
Also, creating and joining a thread is necessarily slow and does not tell a lot about creating a thread as such. Joining presumes that the thread has exited, which can only happen after it has been scheduled to run. Thus, your measurements include delays introduced by scheduling. Seen in that light, the times you measure are actually pretty good (they could easily be 20 times longer!).
However, it does not matter a lot whether spawning threads is slow anyway.
Creating 20,000 threads as in your benchmark in a real program is a serious error. While it is not strictly illegal or disallowed to create thousands (even millions) of threads, the "correct" way of using threads is to create roughly as many threads as there are CPU cores. One does not create very short-lived threads all the time either.
You might have a few short-lived ones, and you might create a few extra threads (which e.g. block on I/O), but you will not want to create hundreds or thousands of these. Every additional thread (beyond the number of CPU cores) means more context switches, more scheduler work, more cache pressure, and 1MB of address space and 64kB of physical memory gone per thread (due to stack reserve and commit granularity).
Now, assume you create, for example, 10 threads at program start; it does not matter at all whether this takes 3 milliseconds altogether. It takes several hundred milliseconds (at least) for the program to start up anyway; nobody will notice a difference.
Visual C++ uses the Concurrency Runtime (MS-specific) to implement std::thread features. When you directly call any Concurrency Runtime feature/function, it creates a default runtime object (not going into details). The same happens when you call a std::thread function: it behaves as if a ConcRT function were invoked.
The creation of the default runtime (or, say, the scheduler) takes some time, and hence thread creation appears slow. Try creating a std::thread object and letting it run first; then execute the benchmarking code (the whole of the above code, for example).
EDIT:
Skim over it - http://www.codeproject.com/Articles/80825/Concurrency-Runtime-in-Visual-C
Do step-into debugging to see when the ConcRT library is invoked and what it is doing.

Why is usleep not working at boot time?

I have a daemon that launchd runs at system boot (OS X). I need to delay startup of my daemon by 3-5 seconds, yet the following code executes instantly at boot, but delays properly well after boot:
#include <unistd.h>
...
printf("Before delay\n");
unsigned int delay = 3000000;
while( (delay = usleep(delay)) > 0)
{
    ;
}
printf("After delay\n");
If I run it by hand after the system has started, it delays correctly. If I let launchd start it at boot the console log shows that there is no delay between Before delay and After delay - they are executed in the same second.
If I could get launchd to execute my daemon after a delay after boot that would be fine as well, but my reading suggests that this isn't possible (perhaps I'm wrong?).
Otherwise, I need to understand why usleep isn't working, and what I can do to fix it, or what delay I might be able to use instead that works that early in the boot process.
First things first. Put in some extra code to also print out the current time, rather than relying on launchd to do it.
It's possible that the different flushing behaviour for standard output may be coming into play.
If standard output can be determined to be an interactive device (such as running it from the command line), it is line buffered - you'll get the "before" line flushed before the delay.
Otherwise, it's fully buffered so the flush may not happen until the program exits (or you reach the buffer size of (for example) 4K. That means that launchd may see the lines come out together, both after the delay.
Getting the C code to timestamp the lines will tell you if this is the problem, something like:
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main (void) {
    printf("%ld: Before delay\n", (long)time(NULL));
    unsigned int delay = 3000000;
    while( (delay = usleep(delay)) > 0)
        ;
    printf("%ld: After delay\n", (long)time(NULL));
    return 0;
}
To see why the buffering may be a problem, consider running that program above as follows:
pax> ./testprog | while read; do echo $(date): $REPLY; done
Tue Jan 31 12:59:24 WAST 2012: 1327985961: Before delay
Tue Jan 31 12:59:24 WAST 2012: 1327985964: After delay
You can see that, because the buffering causes both lines to appear to the while loop when the program exits, they get the same timestamp of 12:59:24 despite the fact they were generated three seconds apart within the program.
In fact, if you change it as follows:
pax> ./testprog | while read; do echo $(date) $REPLY; sleep 10 ; done
Tue Jan 31 13:03:17 WAST 2012 1327986194: Before delay
Tue Jan 31 13:03:27 WAST 2012 1327986197: After delay
you can see the time seen by the "surrounding" program (the while loop or, in your case, launchd) is totally disconnected from the program itself.
Secondly, usleep is a function that can fail! And it can fail by returning -1. Note that, because delay is an unsigned int, assigning -1 to it yields UINT_MAX, which is greater than zero, so the loop does not simply exit on failure either.
That means, if it fails, your delay will not be what you asked for.
The Single UNIX Specification states, for usleep:
On successful completion, usleep() returns 0. Otherwise, it returns -1 and sets errno to indicate the error.
The usleep() function may fail if: [EINVAL] The time interval specified was 1,000,000 or more microseconds.
That's certainly the case with your code although it would be hard to explain why it works after boot and not before.
Interestingly, the Mac OSX docs don't list EINVAL but they do allow for EINTR if the sleep is interrupted externally. So again, something you should check.
You can check those possibilities with something like:
#include <stdio.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>

int main (void) {
    printf("%ld: Before delay\n", (long)time(NULL));
    unsigned int delay = 3000000;
    while( (delay = usleep(delay)) > 0)
        ;
    printf("%ld: After delay\n", (long)time(NULL));
    printf("Delay became %u, errno is %d\n", delay, errno);
    return 0;
}
One other thing I've just noticed, from your code you seem to be assuming that usleep returns the number of microseconds unslept (remaining) and you loop until it's all done, but that behaviour is not borne out by the man pages.
I know that nanosleep does this (by updating the passed structure to contain the remaining time rather than returning it) but usleep only returns 0 or -1.
The sleep function acts in that manner, returning the number of seconds yet to go. Perhaps you might look into using that function instead, if possible.
In any case, I would still run that (last) code segment above just so you can ascertain what the actual problem is.
According to the old POSIX.1 standard, and as documented in the OSX manual page, usleep returns 0 on success and -1 on error.
If you get an error it's most likely EINTR (the only error documented in the OSX manual page) meaning it has been interrupted by a signal. You better check errno to be certain though. As a side-note, on the Linux manual page it states that you can get EINVAL too in some cases:
usec is not smaller than 1000000. (On systems where that is considered an error.)
As another side-note, usleep has been obsoleted in the latest POSIX.1 standard, in favor of nanosleep.

GetIpAddrTable() leaks memory. How to resolve that?

On my Windows 7 box, this simple program causes the memory use of the application to creep up continuously, with no upper bound. I've stripped out everything non-essential, and it seems clear that the culprit is the Microsoft Iphlpapi function "GetIpAddrTable()". On each call, it leaks some memory. In a loop (e.g. checking for changes to the network interface list), it is unsustainable. There seems to be no async notification API which could do this job, so now I'm faced with possibly having to isolate this logic into a separate process and recycle the process periodically -- an ugly solution.
Any ideas?
// IphlpLeak.cpp - demonstrates that GetIpAddrTable leaks memory internally: run this and watch
// the memory use of the app climb up continuously with no upper bound.
#include <stdio.h>
#include <windows.h>
#include <assert.h>
#include <Iphlpapi.h>

#pragma comment(lib, "Iphlpapi.lib")

void testLeak() {
    static unsigned char buf[16384];
    DWORD dwSize(sizeof(buf));
    if (GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false) == ERROR_INSUFFICIENT_BUFFER)
    {
        assert(0); // we never hit this branch.
        return;
    }
}

int main(int argc, char* argv[]) {
    for (int i = 0; true; i++) {
        testLeak();
        printf("i=%d\n", i);
        Sleep(1000);
    }
    return 0;
}
@Stabledog:
I've run your example, unmodified, for 24 hours but did not observe the program's Commit Size increase indefinitely. It always stayed below 1024 kilobytes. This was on Windows 7 (32-bit, and without Service Pack 1).
Just for the sake of completeness, what happens to memory usage if you comment out the entire if block and the sleep? If there's no leak there, then I would suggest you're correct as to what's causing it.
Worst case, report it to MS and see if they can fix it - you have a nice simple test case to work from which is more than what I see in most bug reports.
Another thing you may want to try is to check the error code against NO_ERROR rather than a specific error condition. If you get back a different error than ERROR_INSUFFICIENT_BUFFER, there may be a leak for that:
DWORD dwRetVal = GetIpAddrTable((PMIB_IPADDRTABLE)buf, &dwSize, false);
if (dwRetVal != NO_ERROR) {
    printf("ERROR: %d\n", dwRetVal);
}
I've been all over this issue now: it appears that there is no acknowledgment from Microsoft on the matter, but even a trivial application grows without bounds on Windows 7 (not XP, though) when calling any of the APIs which retrieve the local IP addresses.
So the way I solved it -- for now -- was to launch a separate instance of my app with a special command-line switch that tells it "retrieve the IP addresses and print them to stdout". I scrape stdout in the parent app, the child exits and the leak problem is resolved.
But it wins "dang ugly solution to an annoying problem", at best.

MSG::time is later than timeGetTime

After noticing some timing discrepancies with events in my code, I boiled the problem all the way down to my Windows message loop.
Basically, unless I'm doing something strange, I'm experiencing this behaviour:
MSG message;
while (PeekMessage(&message, _applicationWindow.Handle, 0, 0, PM_REMOVE))
{
    int timestamp = timeGetTime();
    bool strange = message.time > timestamp; // strange == true!!!
    TranslateMessage(&message);
    DispatchMessage(&message);
}
The only rational conclusion I can draw is that MSG::time uses a different timing mechanism than timeGetTime() and is therefore free to produce differing results.
Is this the case, or am I missing something fundamental?
Could this be a signed unsigned issue? You are comparing a signed int (timestamp) to an unsigned DWORD (msg.time).
Also, the tick count wraps roughly every 49.7 days; when that happens, strange could well be true.
As an aside, if you don't have a great reason to use timeGetTime, you can use GetTickCount here - it saves you bringing in winmm.
The code below shows how you should go about using times - you should never compare the times directly, because clock wrapping messes that up. Instead you should always subtract the start time from the current time and look at the interval.
// This is roughly equivalent code; however, strange should never be true
// here, because the subtraction yields the (wrap-safe) interval, which is
// then reinterpreted as signed rather than comparing raw timestamps.
DWORD timestamp = GetTickCount();
bool strange = ((int)(timestamp - msg.time) < 0);
I don't think it's advisable to expect or rely on any particular relationship between the absolute values of timestamps returned from different sources. For one thing, the multimedia timer may have a different resolution from the system timer. For another, the multimedia timer runs in a separate thread, so you may encounter synchronisation issues. (I don't know if each CPU maintains its own independent tick count.) Furthermore, if you are running any sort of time synchronisation service, it may be making its own adjustments to your local clock and affecting the timestamps you are seeing.
Are you by any chance running an AMD dual core? There is an issue where since each core has a separate timer and can run at different speeds, the timers can diverge from each other. This can manifest itself in negative ping times, for example.
I had similar issues when measuring timeouts in different threads using GetTickCount().
Install this driver (IIRC) to resolve the issue.
MSG.time is based on GetTickCount(), and timeGetTime() uses the multimedia timer, which is completely independent of GetTickCount(). I would not be surprised to see that one timer has 'ticked' before the other.
