std::condition_variable::wait_for exits immediately when given std::chrono::duration::max - c++11

I have a wrapper around std::queue using C++11 semantics to allow concurrent access. The std::queue is protected with a std::mutex. When an item is pushed to the queue, a std::condition_variable is notified with a call to notify_one.
There are two methods for popping an item from the queue. One method will block indefinitely until an item has been pushed on the queue, using std::condition_variable::wait(). The second will block for an amount of time given by a std::chrono::duration unit using std::condition_variable::wait_for():
template <typename T> template <typename Rep, typename Period>
void ConcurrentQueue<T>::Pop(T &item, std::chrono::duration<Rep, Period> waitTime)
{
std::cv_status cvStatus = std::cv_status::no_timeout;
std::unique_lock<std::mutex> lock(m_queueMutex);
while (m_queue.empty() && (cvStatus == std::cv_status::no_timeout))
{
cvStatus = m_pushCondition.wait_for(lock, waitTime);
}
if (cvStatus == std::cv_status::no_timeout)
{
item = std::move(m_queue.front());
m_queue.pop();
}
}
When I call this method like this on an empty queue:
ConcurrentQueue<int> intQueue;
int value = 0;
std::chrono::seconds waitTime(12);
intQueue.Pop(value, waitTime);
Then 12 seconds later, the call to Pop() will exit. But if waitTime is instead set to std::chrono::seconds::max(), then the call to Pop() will exit immediately. The same occurs for milliseconds::max() and hours::max(). But, days::max() works as expected (doesn't exit immediately).
What causes seconds::max() to exit right away?
This is compiled with mingw64:
g++ --version
g++ (rev5, Built by MinGW-W64 project) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

To begin with, the timed wait should likely be a wait_until(lock, std::chrono::steady_clock::now() + waitTime); with the deadline computed once, not wait_for, because the loop restarts the full wait_for every time it wakes up while the queue is still empty (m_queue.empty() is true), so the total wait can exceed waitTime. Such repeats can also be caused by spurious wake-ups.
Fix that part of the code by using the predicated wait methods:
template <typename Rep, typename Period>
bool pop(std::chrono::duration<Rep, Period> waitTime, int& popped)
{
std::unique_lock<std::mutex> lock(m_queueMutex);
if (m_pushCondition.wait_for(lock, waitTime, [] { return !m_queue.empty(); }))
{
popped = m_queue.back();
m_queue.pop_back();
return true;
} else
{
return false;
}
}
On my implementation at least seconds::max() yields 0x7fffffffffffffff
§30.5.1 ad 26 states:
Effects: as if
return wait_until(lock, chrono::steady_clock::now() + rel_time);
Doing
auto time = steady_clock::now() + seconds::max();
std::cout << std::dec << duration_cast<seconds>(time.time_since_epoch()).count() << "\n";
On my system, prints
265521
Using date --date='@265521' --rfc-822 told me that this is Sun, 04 Jan 1970 02:45:21 +0100
There's a wrap-around bug going on for GCC and Clang; see the tester below.
Tester
Live On Coliru
#include <thread>
#include <condition_variable>
#include <mutex>
#include <iostream>
#include <deque>
#include <chrono>
#include <iomanip>
std::mutex m_queueMutex;
std::condition_variable m_pushCondition;
std::deque<int> m_queue;
template <typename Rep, typename Period>
bool pop(std::chrono::duration<Rep, Period> waitTime, int& popped)
{
std::unique_lock<std::mutex> lock(m_queueMutex);
if (m_pushCondition.wait_for(lock, waitTime, [] { return !m_queue.empty(); }))
{
popped = m_queue.back();
m_queue.pop_back();
return true;
} else
{
return false;
}
}
int main()
{
int data;
using namespace std::chrono;
pop(seconds(2) , data);
std::cout << std::hex << std::showbase << seconds::max().count() << "\n";
auto time = steady_clock::now() + seconds::max();
std::cout << std::dec << duration_cast<seconds>(time.time_since_epoch()).count() << "\n";
pop(seconds::max(), data);
}

The reason for the problem is this nasty bit in the description for rel_time parameter:
Note that rel_time must be small enough not to overflow when added to std::chrono::steady_clock::now().
So when you do m_pushCondition.wait_for(lock, std::chrono::seconds::max()); the addition overflows inside the function. In fact, if you enable the undefined behavior sanitizer (e.g. the -fsanitize=undefined option for GCC and Clang) and run the app, you may see the following runtime warning:
/usr/include/c++/9.1.0/chrono:456:34: runtime error: signed integer overflow: 473954758945968 + 9223372036854775807 cannot be represented in type 'long int'
It is worth noting, though, that for some reason I did not get this warning in the actual app I was working on, probably because of a sanitizer bug. Anyway.
So what can you do? First: do not try to work around it by simply using the wait_for() overload with a predicate, because you will build yourself a bad spinlock that burns a CPU core. Second: subtracting max() - now() doesn't seem to work because it changes the type.
One way to work around that is to conditionally call either condition_variable::wait() or condition_variable::wait_for().
Another may be to just declare a big timespan and use it. E.g.:
// This is a replacement to chrono::seconds::max(). The latter doesn't work with
// `wait_for` call because its `rel_time` parameter description has the following
// sentence: "Note that rel_time must be small enough not to overflow when added to
// std::chrono::steady_clock::now()".
const chrono::seconds many_hours = 99h;
// …[snip]…
m_pushCondition.wait_for(lock, many_hours);
// …[snip]…
You can probably tolerate a "spurious" wakeup once every 99 hours :)

Related

boost asio: Is it thread safe to call tcp::socket::async_read_some() when handler is protected by a strand

I'm struggling to fully understand Boost ASIO and strands. I was under the impression that the call to socket::async_read_some() was safe as long as the handler was wrapped in a strand. This appears not to be the case, since the code eventually throws an exception.
In my situation a third party library is making the Session::readSome() calls. I'm using a reactor pattern with the ASIO layer under the third party library. When data arrives on the socket the 3rd party is called to do the read. The pattern is used since it is necessary to abort the read operation at any time and have the 3rd party library error out and return its thread. The third party expects a blocking read, so the code mimics it with a condition variable.
Given the example below what is the proper way to do this? Do I need to wrap the async_read_some() call in a dispatch() or post() so it runs through a strand too?
Note: Compiler is c++14 ;-(
Example representative code:
Session::Session (ba::io_context& ioContext):
m_sessionStrand ( ioContext.get_executor() ),
m_socket ( m_sessionStrand )
{}
int32_t Session::readSome (unsigned char* pBuffer, uint32_t bufferSizeToRead, boost::system::error_code& errorCode)
{
// The 3rd party expects a synchronous read so we mimic the behavior
// with an async_read and then wait for the results. With this pattern
// we can unblock the read elsewhere - for example by calling close on the socket -
// and still give the 3rd party the illusion of a synchronous read.
// In such cases the 3rd party will receive an error code
// on the read and return its thread.
// Nothing to do
if ( bufferSizeToRead == 0) return 0;
// Create a mutable buffer
ba::mutable_buffer buffer (pBuffer, bufferSizeToRead);
std::size_t result = 0;
errorCode.clear();
// Setup conditional
m_readerPause.exchange(true);
auto readHandler = [&result, &errorCode, self=shared_from_this()](boost::system::error_code ec, std::size_t bytesRead)
{
result = bytesRead;
errorCode = ec;
// Signal that we got results
std::unique_lock<std::mutex> lock{m_readerMutex};
m_readerPause.exchange(false);
m_readerPauseCV.notify_all();
};
m_socket.async_read_some(buffer, ba::bind_executor (m_sessionStrand, readHandler));
// We pause the 3rd party read thread until we get the read results back - or an error occurs
{
std::unique_lock<std::mutex> lock{m_readerMutex};
m_readerPauseCV.wait (lock, [this]{ return !m_readerPause.load(std::memory_order_acquire); } );
}
return result;
}
The exception occurs in epoll_reactor.ipp. There is a race condition between the read and closing the socket.
void epoll_reactor::start_op(int op_type, socket_type descriptor,
epoll_reactor::per_descriptor_data& descriptor_data, reactor_op* op,
bool is_continuation, bool allow_speculative)
{
if (!descriptor_data)
{
op->ec_ = boost::asio::error::bad_descriptor;
post_immediate_completion(op, is_continuation);
return;
}
mutex::scoped_lock descriptor_lock(descriptor_data->mutex_);
if (descriptor_data->shutdown_) //!! SegFault here: descriptor_data == NULL*
{
post_immediate_completion(op, is_continuation);
return;
}
...
}
Thanks in advance for any insights in the proper way to handle this situation using ASIO.
The strand doesn't "protect" the handler. Instead, it protects some shared state (which you control) by synchronizing handler execution. It's exactly like a mutex for async execution.
According to this logic all code running on the strand can touch the shared resources, and conversely, code not guaranteed to be on the strand can not be allowed to touch them.
In your code, the shared resources consist of at least buffer, result, m_socket. It would be more complete to include the m_sessionStrand, m_readerPauseCV, m_readerMutex, m_readerPause but all of these are implicitly threadsafe the way they are used¹.
Your code looks to do things safely in these regards. However it makes a few unfortunate detours that make it harder than necessary to check/reason about the code:
it uses more (local) shared state to communicate results from the handler
it doesn't make explicit what the mutex and/or the strand protect
it employs both a mutex and a strand which conceptually compete for the same responsibility
it employs both a condition and an atomic bool, which again compete for the same responsibility
it does manual strand binding, which muddies the expectations about what the native executor for the m_socket object is expected to be
the initial read is not protected. This means that if Session::readSome is invoked from a "wild" thread, it will use member functions without synchronizing with any other operations that may be pending on the m_socket.
the atomic_bool mutations are spelled in Very Convoluted Ways(TM), which serve to show you (presumably) understand the memory model, but make the code harder to review without tangible merit. Clearly, the blocking synchronization will (far) outweigh any benefit of explicit memory acquisition order. I suggest to at least "normalize" the spelling as atomic_bool was explicitly designed to afford:
//m_readerPause.exchange(true);
m_readerPause = true;
and
m_readerPauseCV.wait(lock, [this] { return !m_readerPause; });
since you are emulating blocking IO, there is no merit in capturing shared_from_this() in the lambda. Lifetime should be guaranteed by the calling party anyway.
Interestingly, you didn't show a capture of this, which is required for the lambda to compile, assuming you didn't use global variables.
Kudos for explicitly clearing the error_code output variable. This is oft forgotten. Technically, you did forget about it with the (questionable?) early exit when (bufferSizeToRead == 0)... You might have a slightly unorthodox caller contract where this makes sense.
To be generic I'd suggest to perform the zero-length read as it might behave differently depending on the transport connected.
Last, but not least, m_socket.[async_]read_some is rarely what you require on application protocol level. I'll leave this one to you, as you might have this exceptional edge-case scenario.
Simplifying
Conceptually, I'd like to write:
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) {
ec.clear();
size_t result = 0;
std::tie(ec, result) = m_socket
.async_read_some(ba::buffer(buf, size),
ba::as_tuple(ba::use_future))
.get();
return result;
}
This uses futures to get the blocking behaviour while being cancelable. Sadly, contrary to expectation there is currently a limitation that prevents combining as_tuple and use_future.
So, we have to either ignore partial success scenarios (a significant result even though ec is set):
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) try {
ec.clear();
return m_socket
.async_read_some(ba::buffer(buf, size), ba::use_future)
.get();
} catch (boost::system::system_error const& se) {
ec = se.code();
return 0;
}
I suspect that member-async_read_some doesn't have a partial success mode. However, let's still give it thought, seeing that I warned before that async_read_some is rarely what you need anyways:
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) {
std::promise<std::tuple<size_t, error_code> > p;
m_socket.async_read_some(ba::buffer(buf, size), [&p](error_code ec_, size_t n_) { p.set_value({n_, ec_}); });
size_t result;
std::tie(result, ec) = p.get_future().get();
return result;
}
Still considerably easier.
Interim Result
Self contained example with the current approach:
Live On Coliru
#include <boost/asio.hpp>
#include <future>
#include <memory>
#include <tuple>
namespace ba = boost::asio;
using ba::ip::tcp;
using boost::system::error_code;
using CharT = /*unsigned*/ char; // for ease of output...
struct Session : std::enable_shared_from_this<Session> {
tcp::socket m_socket;
Session(ba::any_io_executor ex) : m_socket(make_strand(ex)) {
m_socket.connect({{}, 7878});
}
int32_t readSome(CharT* buf, uint32_t size, error_code& ec) {
std::promise<std::tuple<size_t, error_code>> p;
m_socket.async_read_some(ba::buffer(buf, size), [&p](error_code ec_, size_t n_) {
p.set_value({n_, ec_});
});
size_t result;
std::tie(result, ec) = p.get_future().get();
return result;
}
};
#include <iomanip>
#include <iostream>
int main() {
ba::thread_pool ioc;
auto s = std::make_shared<Session>(ioc.get_executor());
error_code ec;
CharT data[10];
while (auto n = s->readSome(data, 10, ec))
std::cout << "Received " << quoted(std::string(data, n)) << " (" << ec.message() << ")\n";
ioc.join();
}
Testing with
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp
for resp in FOO LONG_BAR_QUX_RESPONSE; do nc -tln 7878 -w 0 <<< $resp; done&
set -x
sleep .2; ./a.out
sleep .2; ./a.out
Prints
+ sleep .2
+ ./a.out
Received "FOO
" (Success)
+ sleep .2
+ ./a.out
Received "LONG_BAR_Q" (Success)
Received "UX_RESPONS" (Success)
Received "E
" (Success)
External Synchronization (Cancellation?)
Now, code not shown implies that other operations may act on m_socket, if only to cancel operations in flight³. If this situation arises, you have to add the missing synchronization, either using the mutex or the strand.
I suggest not introducing the competing synchronization mechanism, even though doing so would not be "incorrect". Sticking to the strand will
lead to simpler code
allow you to solidify your understanding of the use of the strand.
So, let's make sure that the operation runs on the strand:
int32_t readSome(CharT* buf, uint32_t size, error_code& ec) {
std::promise<size_t> p;
post(m_socket.get_executor(), [&] {
m_socket.async_read_some(ba::buffer(buf, size),
[&](error_code ec_, size_t n_) { ec = ec_; p.set_value(n_); });
});
return p.get_future().get();
}
void cancel() {
post(m_socket.get_executor(),
[self = shared_from_this()] { self->m_socket.cancel(); });
}
See it Live On Coliru
Exercising Cancellation
int main() {
ba::thread_pool ioc(1);
auto s = std::make_shared<Session>(ioc.get_executor());
std::thread th([&] {
std::this_thread::sleep_for(5s);
s->cancel();
});
error_code ec;
CharT data[10];
do {
auto n = s->readSome(data, 10, ec);
std::cout << "Received " << quoted(std::string(data, n)) << " (" << ec.message() << ")\n";
} while (!ec);
ioc.join();
th.join();
}
Again, Live On Coliru
¹ Technically in a multi-thread situation you need to notify the CV under the lock to allow for fair scheduling, i.e. to prevent waiter starvation. However your scenario is so isolated that you can get away with being somewhat sloppy.
² by default tcp::socket type-erases the executor with any_io_executor, but you could use basic_stream_socket<tcp, strand<io_context::executor_type> > to remove that cost if your executor type is statically known
³ Of course, POSIX sockets include full duplex scenarios, where read and write operations can be in flight simultaneously.
UPDATE: redirect_error
Just re-discovered redirect_error which allows something close to as_tuple:
auto readSome(CharT* buf, uint32_t size, error_code& ec) {
return m_socket
.async_read_some(ba::buffer(buf, size),
ba::redirect_error(ba::use_future, ec))
.get();
}
void cancel() { m_socket.cancel(); }
This only suffices when readSome and cancel are guaranteed to be invoked on the strand.

Boost asio post with shared ptr passed as argument with std::move

I am new to boost::asio. I need to pass a shared_ptr as an argument to a handler function.
E.g.
boost::asio::post(std::bind(&::function_x, std::move(some_shared_ptr)));
Is using std::move(some_shared_ptr) correct, or should I use the form below?
boost::asio::post(std::bind(&::function_x, some_shared_ptr));
If both are correct, which one is advisable?
Thanks in advance
Regards
Shankar
Bind stores arguments by value.
So both are correct and probably equivalent. Moving the argument into the bind is potentially more efficient if some_shared_ptr is not going to be used after the bind.
Warning: Advanced Use Cases
(just skip this if you want)
Not what you asked: what if function_x took rvalue-reference arguments?
Glad you asked. You can't. However, you can still receive by lvalue reference and just move from that, because:
std::move doesn't move
The rvalue-reference is only there to indicate potentially-moved-from arguments enabling some smart compiler optimizations and diagnostics.
So, as long as you know your bound function is only executed once (!!) then it's safe to move from lvalue parameters.
In the case of shared-pointers there's actually a little bit more leeway, because moving from the shared-ptr doesn't actually move the pointed-to element at all.
So, a little exercise demonstrating it all:
Live On Coliru
#include <boost/asio.hpp>
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
static void foo(std::shared_ptr<int>& move_me) {
if (!move_me) {
std::cout << "already moved!\n";
} else {
std::cout << "argument: " << *std::move(move_me) << "\n";
move_me.reset();
}
}
int main() {
std::shared_ptr<int> arg = std::make_shared<int>(42);
std::weak_ptr<int> observer = std::weak_ptr(arg);
assert(observer.use_count() == 1);
auto f = std::bind(foo, std::move(arg));
assert(!arg); // moved
assert(observer.use_count() == 1); // so still 1 usage
{
boost::asio::io_context ctx;
post(ctx, f);
ctx.run();
}
assert(observer.use_count() == 1); // so still 1 usage
f(); // still has the shared arg
// but now the last copy was moved from, so it's gone
assert(observer.use_count() == 0); //
f(); // already moved!
}
Prints
argument: 42
argument: 42
already moved!
Why Bother?
Why would you care about the above? Well, since in Asio you have a lot of handlers that are guaranteed to execute precisely ONCE, you can sometimes avoid the overhead of shared pointers (the synchronization, the allocation of the control block, the type erasure of the deleter).
That is, you can use move-only handlers using std::unique_ptr<>:
Live On Coliru
#include <boost/asio.hpp>
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
static void foo(std::unique_ptr<int>& move_me) {
if (!move_me) {
std::cout << "already moved!\n";
} else {
std::cout << "argument: " << *std::move(move_me) << "\n";
move_me.reset();
}
}
int main() {
auto arg = std::make_unique<int>(42);
auto f = std::bind(foo, std::move(arg)); // this handler is now move-only
assert(!arg); // moved
{
boost::asio::io_context ctx;
post(
ctx,
std::move(f)); // move-only, so move the entire bind (including arg)
ctx.run();
}
f(); // already executed
}
Prints
argument: 42
already moved!
This is going to help a lot in code that uses a lot of composed operations: you can now bind the state of the operation into the handler with zero overhead, even if it's bigger and dynamically allocated.

How does C++ store variables captured by a lambda that have gone out of scope?

If a function returns a lambda that captures and mutates a value declared in the scope of the function, where/how is that value stored in memory so the lambda may safely use it?
This example is from listing 6.7 in 'Functional Programming in C++' by Ivan Čukić. It's a utility memoization method that caches results for fast lookup later. The contrived usage computes and then retrieves a cached Fibonacci number:
#include <iostream>
#include <map>
#include <tuple>
template <typename Result, typename... Args>
auto make_memoized(Result (*f)(Args...)) {
std::map<std::tuple<Args...>, Result> cache;
return [f, cache](Args... args) mutable -> Result {
const auto args_tuple = std::make_tuple(args...);
const auto cached = cache.find(args_tuple);
if (cached == cache.end()) {
auto result = f(args...);
cache[args_tuple] = result;
return result;
} else {
return cached->second;
}
};
}
unsigned int fib(unsigned int n) {
return n < 2 ? n : fib(n - 1) + fib(n - 2);
}
int main() {
auto fibmemo = make_memoized(fib);
std::cout << "fib(15) = " << fibmemo(15) << '\n';
std::cout << "fib(15) = " << fibmemo(15) << '\n';
}
My expectation was that cache would be destroyed when make_memoized returned, so a subsequent call to the lambda would have referred to a value that has gone out of scope. However it works fine (g++ 9.1 on OSX).
I can't find a concrete example of this sort of usage on cppreference.com. Any help leading me to the right terminology to search for is greatly appreciated.
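The [f, cache] captures the variables by value. The copies are stored as members of the closure object (the lambda itself), so they live exactly as long as the lambda does. The local cache in make_memoized is still destroyed on return, but the lambda no longer refers to it.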
EDIT: If captured by reference (e.g. [f, &cache]), the life of cache and the lambda are no longer linked. So, while the code will still compile, it is no longer safe to use the returned lambda as cache has already been destroyed by then.

std::string::assign vs std::string::operator=

I coded in Borland C++ ages ago, and now I'm trying to understand the "new" (to me) C++11 (I know, we're in 2015, there's a C++14 ... but I'm working on a C++11 project).
Now I have several ways to assign a value to a string.
#include <iostream>
#include <string>
int main ()
{
std::string test1;
std::string test2;
test1 = "Hello World";
test2.assign("Hello again");
std::cout << test1 << std::endl << test2;
return 0;
}
They both work. I learned from http://www.cplusplus.com/reference/string/string/assign/ that there are other ways to use assign. But for simple string assignment, which one is better? I have to fill 100+ structs with 8 std::string each, and I'm looking for the fastest mechanism (I don't care about memory, unless there's a big difference).
Both are equally fast, but = "..." is clearer.
If you really want fast though, use assign and specify the size:
test2.assign("Hello again", sizeof("Hello again") - 1); // don't copy the null terminator!
// or
test2.assign("Hello again", 11);
That way the string's length doesn't have to be computed first; either form needs at most one allocation. (You could also .reserve() enough memory beforehand so that no allocation happens during the assignment.)
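I tried benchmarking both ways.
#include <benchmark/benchmark.h>
#include <string>
static void string_assign_method(benchmark::State& state) {
std::string str;
std::string base="123456789";
// Code inside this loop is measured repeatedly
for (auto _ : state) {
// Note: for a std::string argument, the second parameter of assign() is a
// start position, not a length, so this assigns the substring from index 9
str.assign(base, 9);
}
}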
// Register the function as a benchmark
BENCHMARK(string_assign_method);
static void string_assign_operator(benchmark::State& state) {
std::string str;
std::string base="123456789";
// Code before the loop is not measured
for (auto _ : state) {
str = base;
}
}
BENCHMARK(string_assign_operator);
In the graphical comparison, both methods appear to be about equally fast, with the assignment operator showing the better results.
Use string::assign only if a substring starting at a specific position of the base string has to be assigned.

C++11 chrono library - How to execute method after a specific time interval?

I want to properly use the chrono library to configure my class to call a method after some milliseconds.
#include <iostream>
#include <chrono>
#include <ctime>
class House
{
private:
//...
public:
House() {};
~House() {};
void method1() { std::cout << "method1 called" << std::endl; };
void method2() { std::cout << "method2 called" << std::endl; };
void method3() { std::cout << "method3 called" << std::endl; };
};
int main()
{
House h;
//For the object 'h', I need to call method1() after 100ms
// ???
//For the object 'h', I need to call method2() after 200ms
// ???
//For the object 'h', I need to call method3() after 300ms
// ???
return 0;
}
Any ideas how to do this?
This is a snippet from a book I have been reading / studying since I'm just getting into C++. (I started about 3 months ago but before that I practiced Java and Python a bit.) This explains how to do what you're intending to do as well as an example to show. I could have explained it in my own words; however I feel as if this hits the nail on the head:
5.3.4.1 Waiting for Events
Sometimes, a thread needs to wait for some kind of external event, such as another thread completing a task or a certain amount of time having passed. The simplest “event” is simply time passing. Consider:
auto t0 = high_resolution_clock::now();
this_thread::sleep_for(milliseconds{20});
auto t1 = high_resolution_clock::now();
cout << duration_cast<nanoseconds>(t1 - t0).count() << " nanoseconds passed\n";
Note that I didn't even have to launch a thread; by default, this_thread refers to the one and only thread (§ 42.2.6). I used duration_cast to adjust the clock’s units to the nanoseconds I wanted. See § 5.4.1 and § 35.2 before trying anything more complicated than this with time. The time facilities are found in <chrono>.
— The C++ Programming Language 4th Edition by Bjarne Stroustrup
I feel that this method would help accomplish what you're trying to do: run tasks one after the other. Check out <chrono>. To be clear, I found this answer in a book I was reading; it isn't my own work. If you intend to have many tasks running simultaneously, you will need to create threads, and if they happen to share a resource, you will probably need locks such as unique_lock or lock_guard. I prefer unique_lock.

Resources