Early wakeups in WaitForSingleObject()? (Windows)

Everything I've read, both on the MS docs site (where it's not really addressed) and here on SO, says that Windows WaitForSingleObject() is not subject to spurious wakeups and that it waits for at least the provided time, possibly longer. However, my testing says this is not true, and in fact early wakeups happen almost every time. Is the "common wisdom" wrong and do I just need to add loops to handle early wakeups, or am I doing something wrong and need to keep banging my head on this until I figure out what?
Unfortunately the full code is too complex to post here, but I have two different threads each with their own event, created via:
event = CreateEventA(NULL, false, false, NULL);
(event is a thread-local variable). I have a mutex I use to ensure that both threads start running at about the same time.
In each thread I call WaitForSingleObject(). In this specific test I never call SetEvent(), so the only way to finish is via timeout, and the return code shows that's what happens. However, the actual amount of time spent waiting is massively variable, and 90% of the time it is less than the time I requested. I've instrumented this using QueryPerformanceCounter() to measure how long the wait actually takes, and the numbers confirm the waits are cut short. Here's the instrumented code:
LARGE_INTEGER freq, ctr1, ctr2;
QueryPerformanceFrequency(&freq);                 // counter ticks per second
QueryPerformanceCounter(&ctr1);                   // timestamp just before the wait
DWORD ret = WaitForSingleObject(event, tmoutMs);
QueryPerformanceCounter(&ctr2);                   // timestamp just after the wait
uint64_t elapsed = ((uint64_t)ctr2.QuadPart - (uint64_t)ctr1.QuadPart)
                   * 1000000ULL / (uint64_t)freq.QuadPart;   // elapsed microseconds
(here elapsed is kept in microseconds, just to be a bit more specific)
Then I print this info out. In one thread tmoutMs is 2, and in the other thread tmoutMs is 100. Almost every time the returned values are too short: the 2ms wait can take anywhere from 700us up, and the 100ms wait takes from about 93ms up. Only once in 7 tries or so will the elapsed time be >100ms. Here are some sample outputs:
event=104: pause(tmoutMs=2) => ret=258 elapsed us=169, ms=0
event=112: pause(tmoutMs=100) => ret=258 elapsed us=93085, ms=93
event=104: pause(tmoutMs=2) => ret=258 elapsed us=427, ms=0
event=112: pause(tmoutMs=100) => ret=258 elapsed us=94002, ms=94
event=104: pause(tmoutMs=2) => ret=258 elapsed us=3317, ms=3
event=112: pause(tmoutMs=100) => ret=258 elapsed us=96840, ms=96
event=104: pause(tmoutMs=2) => ret=258 elapsed us=11461, ms=11
event=112: pause(tmoutMs=100) => ret=258 elapsed us=105189, ms=105
The return code is always WAIT_TIMEOUT as expected.
Is this reasonable even though it's not documented (or is it documented somewhere that I just can't find), and do I just have to loop on my own to handle early wakeups?
FWIW, this is a C++ program compiled with Visual Studio 2017 running on Windows 10. It's a unit test program using Google Test, and has no graphical interface: it's command-line only.
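If looping really is the answer, here is the kind of re-wait loop I have in mind (just a sketch of the idea, not a tested fix; the function name WaitAtLeast is mine): keep re-waiting until QueryPerformanceCounter() says the full interval has actually elapsed.

#include <windows.h>
#include <cstdint>

// Sketch only: wait on 'event' but do not return WAIT_TIMEOUT until the
// requested interval has really elapsed as measured by QueryPerformanceCounter().
DWORD WaitAtLeast(HANDLE event, DWORD tmoutMs)
{
    LARGE_INTEGER freq, start, now;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);

    const uint64_t targetUs = (uint64_t)tmoutMs * 1000ULL;
    for (;;) {
        QueryPerformanceCounter(&now);
        uint64_t elapsedUs = ((uint64_t)now.QuadPart - (uint64_t)start.QuadPart)
                             * 1000000ULL / (uint64_t)freq.QuadPart;
        if (elapsedUs >= targetUs)
            return WAIT_TIMEOUT;                 // the full interval has really elapsed

        DWORD remainingMs = (DWORD)((targetUs - elapsedUs + 999) / 1000);  // round up
        DWORD ret = WaitForSingleObject(event, remainingMs);
        if (ret != WAIT_TIMEOUT)
            return ret;                          // signaled, abandoned, or failed
        // WAIT_TIMEOUT returned early: loop and re-check the actual elapsed time
    }
}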

Related

If condition is skipped when using multiple threads (JMeter)

After a long time of reading, this is my first post here. :)
My question is the following:
Using JMeter, I have to execute 10000 requests, but between every 1000 of them, I should have sleep time (from 0 to 1000 => sleep time => from 1000 to 2000 => sleep time => ...).
I was able to do that using an If clause and '__counter(FALSE,)' with a pause between every 1000 requests, but it only works with one thread. If I set more than 1 thread, it skips the If clause and the sleep time is not activated. As far as I know, the first parameter of the "counter" function makes it "global" if it is FALSE, but I am confused why the If clause is skipped when more than 1 thread is used.
I'm checking the counter with a Groovy function: ${__groovy("${__counter(FALSE,)}" == "1000")}
How do you know that the "sleep time is not activated"?
Your "sleep time" will be "activated" only once when the counter reaches 1000, on 2000 and so on the condition will not be met
Inlining JMeter functions or variables into Groovy scripts is not very recommended, consider switching to __jexl3() function and changing your expression to something like:
${__jexl3(${__counter(FALSE,)} % 1000 == 0,)}
More information: 6 Tips for JMeter If Controller Usage

How to choose the right value for the expiryTime parameter of the RedLockFactory.CreateLockAsync() method?

I am using the RedLock.net library for resource locking. To lock a resource I am using RedLockFactory.CreateLockAsync.
public async Task<IRedLock> RedLockFactory.CreateLockAsync(
    string resource,
    TimeSpan expiryTime,
    TimeSpan waitTime,
    TimeSpan retryTime,
    CancellationToken? cancellationToken = null)
I understand that this method will attempt to acquire a lock for waitTime, retrying every retryTime. However, I do not understand what the right value for expiryTime would be.
Once a lock has been acquired it will be kept until the lock is disposed, irrespective of the expiryTime. In other words, even if expiryTime is set to 5 seconds, if the lock is only disposed after 10 seconds then the lock will be kept for 10 seconds.
In many examples the value of 30 is used without explanation.
I have tested with a value of 0. A lock is not acquired at all.
I have tested with a value of 5 milliseconds. A lock is acquired and kept until disposed.
So how do I choose the right value for the expiryTime parameter? It seems to me that this parameter is unnecessary and any non-zero positive value is fine.
ExpiryTime determines the maximum time that a lock will be held in the case of a failure (say, the process holding the lock crashing). It also indirectly determines how often the lock is renewed while it is being held.
e.g.
If you set an expiry time of 10 minutes:
the automatic lock renewal timer will call out to redis every 5 minutes (expiry time / 2) to extend the lock
if your process crashes without releasing the lock, you will have to wait up to a maximum of 10 minutes until the key expires in redis and another process can take out a lock on the same resource
If you set an expiry time of 10 milliseconds:
the automatic lock renewal timer will call out to redis every 5 milliseconds (expiry time / 2) to extend the lock (which might be a little excessive)
if your process crashes without releasing the lock, you will have to wait up to a maximum of 10 milliseconds until the key expires in redis and another process can take out a lock on the same resource
It's a balance between how much time you're willing to wait for a lock to expire in the failure case vs how much load you put on your redis servers.

Spring Boot Reactive: elapsed time calculation

I am currently using Spring Boot Reactive (using webflux) to develop a microservice. In it, I implement some kind of elapsed time calculation to determine how long it took for the process to run.
Basically, when the process starts, I capture the current timestamp to mark the start, as follows:
...
metrics.setStartMillis(System.currentTimeMillis());
...
and then it prints whether the process succeeded or not, along with the elapsed time, in doOnSuccess() and onErrorResume() respectively:
...
metrics.setStartMillis(System.currentTimeMillis());
return webclientAdapter.getResponse(request)
        .doOnSuccess(success -> {
            metrics.info(true);   // logs a success metric with the elapsed time at this point
        })
        .onErrorResume(error -> {
            metrics.info(false);  // logs a failure metric with the elapsed time at this point
            return Mono.error(error);  // a Publisher must be returned here; propagating the error is assumed
        });
...
When testing the service by mocking the backend call with a 100 ms delay using curl, the elapsed time is printed correctly (~100 ms). However, during a load test using JMeter, the printed elapsed time becomes very small (roughly 0 to 20 ms), even though the service is configured to call the mock backend with a 100 ms delay.
Does this have to do with the event-loop nature of Reactive, and if so, how can I ensure that the elapsed time of the calling process is calculated properly?
Pardon if there is any confusion, feel free to ask additional information.

Can the Windows system time occasionally go backwards?

Spring Boot project; I log how much time it took to save to the DB:
long start = System.currentTimeMillis();
getDao().batchInsert(batchList);
long end = System.currentTimeMillis();
log.info("Save {} data 2 DB successfully took time: {}", getDescName(), (end - start));
Very strangely, I found a situation where the time cost is negative; see below:
2019-05-16 14:41:04.420 INFO 3324 --- [ave2db-thread-2] c.c.sz_vss.demo.AbstractSave2DBProcess : Save Stock data 2 DB batch size: 416
2019-05-16 14:41:03.152 INFO 3324 --- [ave2db-thread-2] c.c.sz_vss.demo.AbstractSave2DBProcess : Save Stock data 2 DB successfully took time: -1268
Why does this happen? Is it a Spring Boot logging bug, or can the Windows system time occasionally go backwards?
Network time synchronization can apply a correction in either direction, so yes, the system calendar can move backwards. Also, in time zones that observe daylight saving time, you will see +/- 1 hour discontinuities each year.
That's why it is not recommended to use the system calendar for measuring elapsed time. There are monotonic timers (on Windows, QueryPerformanceCounter() combined with QueryPerformanceFrequency(), on POSIX such as Linux it is clock_gettime(CLOCK_MONOTONIC)). Managed frameworks usually wrap these in an object named "Stopwatch".
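For illustration, here is a minimal C/C++ sketch of the QueryPerformanceCounter() approach mentioned above (the Sleep() call is a hypothetical stand-in for the work being timed, not code from the question); in the question's Java context, System.nanoTime() is the analogous monotonic timer.

#include <windows.h>
#include <cstdint>
#include <cstdio>

int main()
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // counter ticks per second
    QueryPerformanceCounter(&start);

    Sleep(100);                         // stand-in for the work being timed (e.g. the batch insert)

    QueryPerformanceCounter(&stop);
    uint64_t elapsedMs = (uint64_t)(stop.QuadPart - start.QuadPart) * 1000ULL
                         / (uint64_t)freq.QuadPart;
    printf("took %llu ms\n", (unsigned long long)elapsedMs);   // monotonic: never negative
    return 0;
}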

Visual Studio PerfTips elapsed time is different from the time from Stopwatch

I use Visual Studio 2017 to debug my code and leverage PerfTips to get the rough elapsed time of a function call.
But I just found a big difference between the PerfTips time and the time from Stopwatch.
Example:
var sw = new Stopwatch();   // System.Diagnostics.Stopwatch
sw.Start();
MyFunction();
sw.Stop();
I set breakpoints before and after the MyFunction() call; PerfTips shows the elapsed time of the MyFunction() call is around 260 ms.
But the sw.Elapsed.TotalMilliseconds value is > 1000 ms. Why such a big difference?
Is anything wrong with my Stopwatch usage or with PerfTips?
BTW: I check the Stopwatch value in the debugger, that is, I set a breakpoint on sw.Stop() and read the value in the debugger window. Is that an incorrect way to get an accurate Stopwatch value?
