Boost process continuously read output - boost

I'm trying to read the output/logs of several processes and display them in a GUI. The processes will run for a long time and produce a huge amount of output. I plan to stream the output from those processes and display it according to my needs, while still allowing my GUI application to take user input and perform other actions.
What I've done here is launch two threads per process from the main thread: one for launching the process and another for reading its output.
This is the solution I've come up with thus far.
// Process Class
namespace bp = boost::process;
class MyProcess {
boost::asio::io_service mService; // member variable of the class
bp::ipstream mStream; // member variable of the class
std::thread mProcessThread, mReaderThread; // member variables of the class.
public:
void launch();
};
void
MyProcess::launch()
{
mReaderThread = std::thread([&](){
std::string line;
while(getline(mStream, line)) {
std::cout << line << std::endl;
}
});
mProcessThread = std::thread([&]() {
auto c = bp::child("/path/of/executable", bp::std_out > mStream, mService);
mService.run();
mStream.pipe().close();
});
}
// Main Gui class
class MyGui
{
MyProcess process;
void launchProcess();
};
void MyGui::launchProcess()
{
process.launch();
doSomethingElse();
}
The program is working as expected so far, but I'm not sure if this is the correct solution. Please let me know if there's an alternative, better, or more correct solution.
Thanks,
Surya

The most striking conceptual issues I see are
Processes are asynchronous; there is no need to add a thread to run them.¹
You prematurely close the pipe:
mService.run();
mStream.pipe().close();
run() is not "blocking" in the sense that it will not wait for the child to exit. You could use wait() to achieve that. Other than that, you can just remove the close() call.
Keeping the close() means you will lose all or part of the output. You might not see any of the output if the child process takes a while before it outputs its first data.
You are accessing mStream from multiple threads without synchronization. This invokes Undefined Behaviour because it is a data race.
In this case you can remove the immediate problem by removing the mStream.pipe().close() call mentioned before, but you must take care to start the reader thread only after the child has been initialized.
Strictly speaking the same caution should be taken for std::cout.
You are passing the io_service reference, but it's not being used. Just dropping it seems like a good idea.
The destructor of MyProcess needs to detach or join the threads. To prevent Zombies, it needs to detach or reap the child pid too.
In combination with the lifetime of mStream detaching the reader thread is not really an option, as mStream is being used from the thread.
Let's put out the first fixes first, and after that I'll show some more simplifications that make sense in the scope of your sample.
First Fixes
I used a simple bash command to emulate a command generating 1000 lines of ping:
Live On Coliru
#include <boost/process.hpp>
#include <iostream>
#include <string>
#include <thread>
#include <vector>
namespace bp = boost::process;
/////////////////////////
class MyProcess {
bp::ipstream mStream;
bp::child mChild;
std::thread mReaderThread;
public:
~MyProcess();
void launch();
};
void MyProcess::launch() {
mChild = bp::child("/bin/bash", std::vector<std::string> {"-c", "yes ping | head -n 1000" }, bp::std_out > mStream);
mReaderThread = std::thread([&]() {
std::string line;
while (getline(mStream, line)) {
std::cout << line << std::endl;
}
});
}
MyProcess::~MyProcess() {
if (mReaderThread.joinable()) mReaderThread.join();
if (mChild.running()) mChild.wait();
}
/////////////////////////
class MyGui {
MyProcess _process;
public:
void launchProcess();
};
void MyGui::launchProcess() {
_process.launch();
// doSomethingElse();
}
int main() {
MyGui gui;
gui.launchProcess();
}
Simplify!
In the current model, the thread doesn't pull its weight.
If you used io_service with asynchronous IO instead, you could even do away with the thread altogether, by polling the service from inside your GUI event loop².
If you're going to keep the thread, and since child processes naturally execute asynchronously³, you could simply do:
Live On Coliru
#include <boost/process.hpp>
#include <iostream>
#include <string>
#include <thread>
#include <vector>
std::thread launch(std::string const& command, std::vector<std::string> args = {}) {
namespace bp = boost::process;
return std::thread([=] {
bp::ipstream stream;
bp::child c(command, args, bp::std_out > stream);
std::string line;
while (getline(stream, line)) {
// TODO likely post to some kind of queue for processing
std::cout << line << std::endl;
}
c.wait(); // reap PID
});
}
The demo displays exactly the same output as earlier.
¹ In fact, adding threads is asking for trouble with fork
² or perhaps idle tick or similar idea. Qt has a ready-made integration (How to integrate Boost.Asio main loop in GUI framework like Qt4 or GTK)
³ on all platforms supported by Boost Process
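The TODO in the reader loop above mentions posting lines to a queue. A minimal sketch of such a queue follows, using only the standard library. The name LineQueue and its member functions are hypothetical and not part of Boost.Process. The idea is that the reader thread calls push, while the GUI thread calls try_pop from its event loop or idle handler so that it never blocks.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper: a mutex-guarded FIFO of output lines.
class LineQueue {
    std::mutex mMutex;
    std::deque<std::string> mLines;

public:
    // Called from the reader thread for every line read from the child.
    void push(std::string line) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLines.push_back(std::move(line));
    }

    // Called from the GUI thread; returns false immediately when empty.
    bool try_pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mLines.empty())
            return false;
        out = std::move(mLines.front());
        mLines.pop_front();
        return true;
    }
};
```

Because try_pop never waits, the GUI can drain a bounded number of lines per tick and stay responsive even when the child produces output faster than it can be displayed.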

Related

boost asio: Is it thread safe to call tcp::socket::async_read_some() when handler is protected by a strand

I'm struggling to fully understand Boost ASIO and strands. I was under the impression that the call to socket::async_read_some() was safe as long as the handler was wrapped in a strand. This appears not to be the case, since the code eventually throws an exception.
In my situation a third-party library is making the Session::readSome() calls. I'm using a reactor pattern with the ASIO layer under the third-party library. When data arrives on the socket, the third-party library is called to do the read. The pattern is used because it must be possible to abort the read operation at any time and have the third-party library error out and return its thread. The third-party library expects a blocking read, so the code mimics one with a condition variable.
Given the example below, what is the proper way to do this? Do I need to wrap the async_read_some() call in a dispatch() or post() so it runs through a strand too?
Note: Compiler is c++14 ;-(
Example representative code:
Session::Session (ba::io_context& ioContext):
m_sessionStrand ( ioContext.get_executor() ),
m_socket ( m_sessionStrand )
{}
int32_t Session::readSome (unsigned char* pBuffer, uint32_t bufferSizeToRead, boost::system::error_code& errorCode)
{
// The 3rd party expects a synchronous read so we mimic the behavior
// with an async_read and then wait for the results. With this pattern
// we can unblock the read elsewhere - for example by calling close on the socket -
// and still give the 3rd party the illusion of a synchronous read.
// In such cases the 3rd party will receive an error code
// on the read and return its thread.
// Nothing to do
if ( bufferSizeToRead == 0) return 0;
// Create a mutable buffer
ba::mutable_buffer buffer (pBuffer, bufferSizeToRead);
std::size_t result = 0;
errorCode.clear();
// Setup conditional
m_readerPause.exchange(true);
auto readHandler = [&result, &errorCode, self=shared_from_this()](boost::system::error_code ec, std::size_t bytesRead)
{
result = bytesRead;
errorCode = ec;
// Signal that we got results
std::unique_lock<std::mutex> lock{m_readerMutex};
m_readerPause.exchange(false);
m_readerPauseCV.notify_all();
};
m_socket.async_read_some(buffer, ba::bind_executor (m_sessionStrand, readHandler));
// We pause the 3rd party read thread until we get the read results back - or an error occurs
{
std::unique_lock<std::mutex> lock{m_readerMutex};
m_readerPauseCV.wait (lock, [this]{ return !m_readerPause.load(std::memory_order_acquire); } );
}
return result;
}
The exception occurs in epoll_reactor.ipp. There is a race condition between the read and closing the socket.
void epoll_reactor::start_op(int op_type, socket_type descriptor,
epoll_reactor::per_descriptor_data& descriptor_data, reactor_op* op,
bool is_continuation, bool allow_speculative)
{
if (!descriptor_data)
{
op->ec_ = boost::asio::error::bad_descriptor;
post_immediate_completion(op, is_continuation);
return;
}
mutex::scoped_lock descriptor_lock(descriptor_data->mutex_);
if (descriptor_data->shutdown_) //!! SegFault here: descriptor_data == NULL*
{
post_immediate_completion(op, is_continuation);
return;
}
...
}
Thanks in advance for any insights in the proper way to handle this situation using ASIO.
The strand doesn't "protect" the handler. Instead, it protects some shared state (which you control) by synchronizing handler execution. It's exactly like a mutex for async execution.
According to this logic all code running on the strand can touch the shared resources, and conversely, code not guaranteed to be on the strand can not be allowed to touch them.
In your code, the shared resources consist of at least buffer, result, m_socket. It would be more complete to include the m_sessionStrand, m_readerPauseCV, m_readerMutex, m_readerPause but all of these are implicitly threadsafe the way they are used¹.
Your code looks to do things safely in these regards. However it makes a few unfortunate detours that make it harder than necessary to check/reason about the code:
it uses more (local) shared state to communicate results from the handler
it doesn't make explicit what the mutex and/or the strand protect
it employs both a mutex and a strand which conceptually compete for the same responsibility
it employs both a condition and an atomic bool, which again compete for the same responsibility
it does manual strand binding, which muddies the expectations about what the native executor for the m_socket object is expected to be
the initial read is not protected. This means that if Session::readSome is invoked from a "wild" thread, it will use member functions without synchronizing with any other operations that may be pending on the m_socket.
the atomic_bool mutations are spelled in Very Convoluted Ways(TM), which serve to show you (presumably) understand the memory model, but make the code harder to review without tangible merit. Clearly, the blocking synchronization will (far) outweigh any benefit of explicit memory acquisition order. I suggest to at least "normalize" the spelling as atomic_bool was explicitly designed to afford:
//m_readerPause.exchange(true);
m_readerPause = true;
and
m_readerPauseCV.wait(lock, [this] { return !m_readerPause; });
since you are emulating blocking IO, there is no merit in capturing shared_from_this() in the lambda. Lifetime should be guaranteed by the calling party anyway.
Interestingly, you didn't show the capture of this, which is required for the lambda to compile, assuming you didn't use global variables.
Kudos for explicitly clearing the error_code output variable. This is oft forgotten. Technically, you did forget about it with the (questionable?) early exit when (bufferSizeToRead == 0)... You might have a slightly unorthodox caller contract where this makes sense.
To be generic I'd suggest to perform the zero-length read as it might behave differently depending on the transport connected.
Last, but not least, m_socket.[async_]read_some is rarely what you require on application protocol level. I'll leave this one to you, as you might have this exceptional edge-case scenario.
Simplifying
Conceptually, I'd like to write:
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) {
ec.clear();
size_t result = 0;
std::tie(ec, result) = m_socket
.async_read_some(ba::buffer(buf, size),
ba::as_tuple(ba::use_future))
.get();
return result;
}
This uses futures to get the blocking behaviour while being cancelable. Sadly, contrary to expectation there is currently a limitation that prevents combining as_tuple and use_future.
So, we have to either ignore partial success scenarios (a significant result delivered together with an error):
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) try {
ec.clear();
return m_socket
.async_read_some(ba::buffer(buf, size), ba::use_future)
.get();
} catch (boost::system::system_error const& se) {
ec = se.code();
return 0;
}
I suspect that member-async_read_some doesn't have a partial success mode. However, let's still give it thought, seeing that I warned before that async_read_some is rarely what you need anyways:
int32_t Session::readSome(unsigned char* buf, uint32_t size, error_code& ec) {
std::promise<std::tuple<size_t, error_code> > p;
m_socket.async_read_some(ba::buffer(buf, size), [&p](error_code ec_, size_t n_) { p.set_value({n_, ec_}); });
size_t result;
std::tie(result, ec) = p.get_future().get();
return result;
}
Still considerably easier.
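The promise-based hand-off does not depend on Asio, so it can be tried in isolation. In the sketch below, fakeAsyncRead is a hypothetical stand-in for an asynchronous operation (it merely invokes the handler from another thread with a fixed byte count), and blockingRead uses the same promise and future pattern as readSome above. It uses std::error_code instead of the Boost type only to stay within the standard library.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <system_error>
#include <thread>
#include <tuple>

// Hypothetical stand-in for an async operation: completes on another thread.
std::thread fakeAsyncRead(std::function<void(std::error_code, std::size_t)> handler) {
    return std::thread([handler] { handler(std::error_code{}, 3); });
}

// Blocks the caller until the handler has delivered both values.
std::size_t blockingRead(std::error_code& ec) {
    std::promise<std::tuple<std::size_t, std::error_code>> p;
    std::future<std::tuple<std::size_t, std::error_code>> f = p.get_future();

    std::thread worker = fakeAsyncRead([&p](std::error_code ec_, std::size_t n_) {
        p.set_value(std::make_tuple(n_, ec_));
    });

    std::size_t result = 0;
    std::tie(result, ec) = f.get(); // waits for set_value
    worker.join();                  // the promise must outlive the handler
    return result;
}
```

Both the byte count and the error code travel through the future, so a partial success would not be lost the way it is when an exception replaces the result.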
Interim Result
Self contained example with the current approach:
Live On Coliru
#include <boost/asio.hpp>
#include <future>
#include <memory>
#include <tuple>
namespace ba = boost::asio;
using ba::ip::tcp;
using boost::system::error_code;
using CharT = /*unsigned*/ char; // for ease of output...
struct Session : std::enable_shared_from_this<Session> {
tcp::socket m_socket;
Session(ba::any_io_executor ex) : m_socket(make_strand(ex)) {
m_socket.connect({{}, 7878});
}
int32_t readSome(CharT* buf, uint32_t size, error_code& ec) {
std::promise<std::tuple<size_t, error_code>> p;
m_socket.async_read_some(ba::buffer(buf, size), [&p](error_code ec_, size_t n_) {
p.set_value({n_, ec_});
});
size_t result;
std::tie(result, ec) = p.get_future().get();
return result;
}
};
#include <iomanip>
#include <iostream>
int main() {
ba::thread_pool ioc;
auto s = std::make_shared<Session>(ioc.get_executor());
error_code ec;
CharT data[10];
while (auto n = s->readSome(data, 10, ec))
std::cout << "Received " << quoted(std::string(data, n)) << " (" << ec.message() << ")\n";
ioc.join();
}
Testing with
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp
for resp in FOO LONG_BAR_QUX_RESPONSE; do nc -tln 7878 -w 0 <<< $resp; done&
set -x
sleep .2; ./a.out
sleep .2; ./a.out
Prints
+ sleep .2
+ ./a.out
Received "FOO
" (Success)
+ sleep .2
+ ./a.out
Received "LONG_BAR_Q" (Success)
Received "UX_RESPONS" (Success)
Received "E
" (Success)
External Synchronization (Cancellation?)
Now, code not shown implies that other operations may act on m_socket, if only to cancel operations in flight³. If this situation arises you have to add the missing synchronization, either using the mutex or the strand.
I suggest not introducing the competing synchronization mechanism, even though it is not "incorrect". Sticking with the strand will
lead to simpler code
allow you to solidify your understanding of the use of the strand.
So, let's make sure that the operation runs on the strand:
int32_t readSome(CharT* buf, uint32_t size, error_code& ec) {
std::promise<size_t> p;
post(m_socket.get_executor(), [&] {
m_socket.async_read_some(ba::buffer(buf, size),
[&](error_code ec_, size_t n_) { ec = ec_; p.set_value(n_); });
});
return p.get_future().get();
}
void cancel() {
post(m_socket.get_executor(),
[self = shared_from_this()] { self->m_socket.cancel(); });
}
See it Live On Coliru
Exercising Cancellation
int main() {
ba::thread_pool ioc(1);
auto s = std::make_shared<Session>(ioc.get_executor());
std::thread th([&] {
std::this_thread::sleep_for(5s);
s->cancel();
});
error_code ec;
CharT data[10];
do {
auto n = s->readSome(data, 10, ec);
std::cout << "Received " << quoted(std::string(data, n)) << " (" << ec.message() << ")\n";
} while (!ec);
ioc.join();
th.join();
}
Again, Live On Coliru
¹ Technically in a multi-thread situation you need to notify the CV under the lock to allow for fair scheduling, i.e. to prevent waiter starvation. However your scenario is so isolated that you can get away with being somewhat sloppy.
² by default tcp::socket type-erases the executor with any_io_executor, but you could use basic_stream_socket<tcp, strand<io_context::executor_type> > to remove that cost if your executor type is statically known
³ Of course, POSIX sockets include full duplex scenarios, where read and write operations can be in flight simultaneously.
UPDATE: redirect_error
Just re-discovered redirect_error which allows something close to as_tuple:
auto readSome(CharT* buf, uint32_t size, error_code& ec) {
return m_socket
.async_read_some(ba::buffer(buf, size),
ba::redirect_error(ba::use_future, ec))
.get();
}
void cancel() { m_socket.cancel(); }
This only suffices when readSome and cancel are guaranteed to be invoked on the strand.

keyboard interrupt routine visual studio C++ console app

I am using VS 2022 Preview to write a C++ console application. I wish to detect a keyboard hit and have my interrupt handler function called. I want the key press detected quickly in case main is in a long loop and therefore not using kbhit().
I found signal() but the debugger stops when the Control-C is detected. Maybe it is a peculiarity of the IDE. Is there a function or system call that I should use?
Edit: I am vaguely aware of threads. Could I spawn a thread that just watches kbd and then have it raise(?) an interrupt when a key is pressed?
I was able to do it by adding a thread. On the target I will have real interrupts to trigger my ISR but this is close enough for algorithm development. It seemed that terminating the thread was more trouble than it was worth so I rationalized that I am simulating an embedded system that does not need fancy shutdowns.
I decided to just accept one character at a time in the phony ISR then I can buffer them and wait and process the whole string when I see a CR, a simple minded command line processor.
// Scheduler.cpp : This file contains the 'main' function. Program execution begins and ends there.
//
#include <Windows.h>
#include <iostream>
#include <thread>
#include <conio.h>
void phonyISR(int tbd)
{
char c;
while (1)
{
std::cout << "\nphonyISR() waiting for kbd input:";
c = _getch();
std::cout << "\nGot >" << c << "<";
}
}
int main(int argc, char* argv[])
{
int tbd = 0; // placeholder argument; initialized so the thread does not copy an indeterminate value
std::thread t = std::thread(phonyISR, tbd);
// Main thread doing its stuff
int i = 0;
while (1)
{
Sleep(2000);
std::cout << "\nMain: " << i++;
}
return 0;
}
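The buffering described above (collect one character per phony interrupt, then process the whole string on CR) can be sketched independently of _getch(). CommandBuffer and feed are hypothetical names; the class only assumes that the phony ISR hands it one character at a time.

```cpp
#include <string>

// Hypothetical helper: accumulates characters until a carriage return.
class CommandBuffer {
    std::string mPending;

public:
    // Feed one character. Returns true when a complete command is ready,
    // in which case `command` receives it and the buffer starts over.
    // A bare CR or LF yields an empty command.
    bool feed(char c, std::string& command) {
        if (c == '\r' || c == '\n') {
            command.swap(mPending);
            mPending.clear();
            return true;
        }
        mPending.push_back(c);
        return false;
    }
};
```

Inside phonyISR this would be called as feed(_getch(), cmd), dispatching cmd whenever it returns true. If the main thread also needs to see the command, the hand-over needs its own synchronization, which this sketch leaves out.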

Boost asio post with shared ptr passed as argument with std::move

I am new to Boost.Asio. I need to pass a shared_ptr as an argument to a handler function.
E.g.
boost::asio::post(std::bind(&::function_x, std::move(some_shared_ptr)));
Is using std::move(some_shared_ptr) correct, or should I use the following instead?
boost::asio::post(std::bind(&::function_x, some_shared_ptr));
If both are correct, which one is advisable?
Thanks in advance
Regards
Shankar
Bind stores arguments by value.
So both are correct and probably equivalent. Moving the argument into the bind is potentially more efficient if some_shared_ptr is not going to be used after the bind.
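A small sketch makes the difference observable through use_count(). It uses only the standard library; function_x here is a hypothetical placeholder whose signature is assumed, since the question does not show it.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical handler; the signature is an assumption.
void function_x(std::shared_ptr<int>) {}

// Copying into the bind: the caller keeps its pointer, so there are two owners.
long ownersAfterCopy() {
    auto sp = std::make_shared<int>(42);
    auto bound = std::bind(&function_x, sp);
    (void)bound;
    return sp.use_count();
}

// Moving into the bind: ownership is transferred and the caller's pointer is empty.
bool callerEmptyAfterMove() {
    auto sp = std::make_shared<int>(42);
    auto bound = std::bind(&function_x, std::move(sp));
    (void)bound;
    return sp == nullptr;
}
```

The copy costs one atomic reference-count increment (and a later decrement), which the move avoids. That is the whole efficiency difference.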
Warning: Advanced Use Cases
(just skip this if you want)
Not what you asked: what if function_x took rvalue-reference arguments?
Glad you asked. You can't. However, you can still receive by lvalue reference and just move from that, because:
std::move doesn't move
The rvalue-reference is only there to indicate potentially-moved-from arguments enabling some smart compiler optimizations and diagnostics.
So, as long as you know your bound function is only executed once (!!) then it's safe to move from lvalue parameters.
In the case of shared-pointers there's actually a little bit more leeway, because moving from the shared-ptr doesn't actually move the pointed-to element at all.
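The claim that std::move doesn't move can be checked in a few lines. In this sketch (castOnly is a hypothetical name), the string is cast to an rvalue reference but never handed to a move constructor or move assignment, so its contents stay where they are.

```cpp
#include <string>
#include <utility>

// std::move alone only casts to an rvalue reference; nothing is transferred
// until a move constructor or move assignment actually consumes the value.
std::string castOnly() {
    std::string s = "payload";
    std::string&& ref = std::move(s); // cast only, no move is performed
    (void)ref;
    return s; // still holds "payload"
}
```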
So, a little exercise demonstrating it all:
Live On Coliru
#include <boost/asio.hpp>
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
static void foo(std::shared_ptr<int>& move_me) {
if (!move_me) {
std::cout << "already moved!\n";
} else {
std::cout << "argument: " << *std::move(move_me) << "\n";
move_me.reset();
}
}
int main() {
std::shared_ptr<int> arg = std::make_shared<int>(42);
std::weak_ptr<int> observer = std::weak_ptr(arg);
assert(observer.use_count() == 1);
auto f = std::bind(foo, std::move(arg));
assert(!arg); // moved
assert(observer.use_count() == 1); // so still 1 usage
{
boost::asio::io_context ctx;
post(ctx, f);
ctx.run();
}
assert(observer.use_count() == 1); // so still 1 usage
f(); // still has the shared arg
// but now the last copy was moved from, so it's gone
assert(observer.use_count() == 0); //
f(); // already moved!
}
Prints
argument: 42
argument: 42
already moved!
Why Bother?
Why would you care about the above? Well, since in Asio you have a lot of handlers that are guaranteed to execute precisely ONCE, you can sometimes avoid the overhead of shared pointers (the synchronization, the allocation of the control block, the type erasure of the deleter).
That is, you can use move-only handlers using std::unique_ptr<>:
Live On Coliru
#include <boost/asio.hpp>
#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
static void foo(std::unique_ptr<int>& move_me) {
if (!move_me) {
std::cout << "already moved!\n";
} else {
std::cout << "argument: " << *std::move(move_me) << "\n";
move_me.reset();
}
}
int main() {
auto arg = std::make_unique<int>(42);
auto f = std::bind(foo, std::move(arg)); // this handler is now move-only
assert(!arg); // moved
{
boost::asio::io_context ctx;
post(
ctx,
std::move(f)); // move-only, so move the entire bind (including arg)
ctx.run();
}
f(); // already executed
}
Prints
argument: 42
already moved!
This is going to help a lot in code that uses a lot of composed operations: you can now bind the state of the operation into the handler with zero overhead, even if it's bigger and dynamically allocated.

C++ Async function not launched asynchronously

I am trying to launch a function asynchronously but it gets launched synchronously.
#include <thread>
#include <future>
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;
std::future<int> setPromise()
{
auto promise = std::make_shared<std::promise<int>>();
auto future = promise->get_future();
auto asyncFn = [&]() {
cout << "Async started ...\n";
for(int i=0; i<100000; i++)
for(int j=0; j<10000; j++) {}
promise->set_value(400);
fprintf(stderr, "Async ended ...\n");
};
std::async(std::launch::async, asyncFn);
return future;
}
int main()
{
std::future<int> result = setPromise();
cout << "Asynchronously launched \n";
int ret = result.get();
cout << ret << endl;
return 0;
}
Compiled it with the following command
g++ -std=c++11 -pthread promise.cpp -o promise
I expect the lambda to be called asynchronously, and while the loop is running in the asynchronous thread I expect to see the logs from main. But the function never seems to be launched asynchronously: the lambda always completes first, and only then do the next statements in main execute.
What I expect
Async started ...
Asynchronously launched
Async ended ...
What I get is
Async started ...
Async ended ...
Asynchronously launched
The line below
std::async(std::launch::async, asyncFn);
creates a temporary future object, and its destructor returns only once the task started by async has finished. So at the end of that statement inside setPromise, execution is blocked until the job asyncFn ends.
The documentation of the std::future destructor describes this behaviour and what happens when the shared state of the future is not ready.
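One way to avoid the blocking temporary, sketched here under the assumption that the caller is happy to receive the future produced by std::async directly, is to return that future instead of discarding it. The function name startWork and the 50 ms sleep standing in for the busy loop are illustrative choices, not part of the original code.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Returns the future created by std::async, so no temporary future is
// destroyed (and therefore nothing blocks) inside this function.
std::future<int> startWork() {
    return std::async(std::launch::async, [] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulated work
        return 400;
    });
}
```

The returned future still waits for the task in its destructor, so the caller decides where that wait happens, typically at the call to get().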
It probably is running asynchronously; it just completes quickly.
To be sure, you need to make your logging free of race conditions.
Something like this (just the idea):
std::future<int> setPromise()
{
std::atomic<bool> canGo{false};
auto asyncFn = [&] {
while (!canGo) std::this_thread::yield(); // wait for the go-ahead
log("Async started ..."); // also use thread-safe logging
...
};
auto task = std::async(std::launch::async, asyncFn); // keep the future: a discarded one would block right here
log("letting it go...");
canGo = true;
...
}
Note also that output written to std::cout from several threads at once can interleave, so you had better use a thread-safe logger when experimenting.

C++11 chrono library - How to execute method after a specific time interval?

I want to use the chrono library properly to configure my class to call a method after some milliseconds.
#include <iostream>
#include <chrono>
#include <ctime>
class House
{
private:
//...
public:
House() {};
~House() {};
void method1() { std::cout << "method1 called" << std::endl; };
void method2() { std::cout << "method2 called" << std::endl; };
void method3() { std::cout << "method3 called" << std::endl; };
};
int main()
{
House h;
//For the object 'h', I need to call method1() after 100ms
// ???
//For the object 'h', I need to call method2() after 200ms
// ???
//For the object 'h', I need to call method3() after 300ms
// ???
return 0;
}
Any ideas how to do this?
This is a snippet from a book I have been reading / studying since I'm just getting into C++. (I started about 3 months ago but before that I practiced Java and Python a bit.) This explains how to do what you're intending to do as well as an example to show. I could have explained it in my own words; however I feel as if this hits the nail on the head:
5.3.4.1 Waiting for Events
Sometimes, a thread needs to wait for some kind of external event, such as another thread completing a task or a certain amount of time having passed. The simplest “event” is simply time passing. Consider:
auto t0 = high_resolution_clock::now();
this_thread::sleep_for(milliseconds{20});
auto t1 = high_resolution_clock::now();
cout << duration_cast<nanoseconds>(t1 - t0).count() << " nanoseconds passed\n";
Note that I didn't even have to launch a thread; by default, this_thread refers to the one and only thread (§ 42.2.6). I used duration_cast to adjust the clock’s units to the nanoseconds I wanted. See § 5.4.1 and § 35.2 before trying anything more complicated than this with time. The time facilities are found in <chrono>.
— The C++ Programming Language 4th Edition by Bjarne Stroustrup
I feel that this method would help you do what you're trying to do: run tasks one after the other. Check out <chrono>. To be clear, the excerpt above is not my work; it comes from the book. If you intend to have many tasks running simultaneously, you will need to create threads, and if they share a resource you will probably need a mutex, locked through unique_lock or lock_guard. I prefer unique_lock.
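Applied to the question, a minimal single-threaded sketch could use std::this_thread::sleep_until against one shared start time, so that the 100 ms, 200 ms and 300 ms offsets do not accumulate drift. callAfter is a hypothetical helper, not a standard facility, and it blocks the calling thread while waiting.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: blocks until `start + delay`, then invokes `fn`.
void callAfter(std::chrono::steady_clock::time_point start,
               std::chrono::milliseconds delay,
               const std::function<void()>& fn) {
    std::this_thread::sleep_until(start + delay);
    fn();
}
```

With auto start = std::chrono::steady_clock::now(); taken once in main, the three calls become callAfter(start, std::chrono::milliseconds(100), [&h] { h.method1(); }); and likewise for method2 at 200 ms and method3 at 300 ms.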
