I am trying to parse a stream by using popen with a command that returns a constant stream of output lines.
This makes the application get stuck on the fgets() call.
Here's the method:
std::string MyClass::InvokeCmd(std::string command)
{
    std::string result;
    std::array<char, 128> buffer;
    FILE *pipe = popen(command.c_str(), "r");
    while (fgets(buffer.data(), 128, pipe) != NULL)
    {
        result += buffer.data();
    }
    pclose(pipe);
    return result;
}
The command is a ROS command:
rostopic hz /topicname
The command runs continuously and produces one line of output approximately every second.
If I wait for around 30 seconds (which looks like the time it takes a buffer to fill and flush), I do see the data.
This looks like buffering inside the rostopic utility. When stdout is connected to a terminal, the C library is typically smart enough to flush every time '\n' is written. When stdout is connected to a pipe, the library uses a large block buffer instead, and it looks like it takes about 30 seconds of output to fill it.
To test this theory, try rostopic hz /topicname | cat on the command line.
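As a small illustration of that rule (just a sketch of the stdio behaviour, not code from the question): the buffering mode is chosen inside the producer based on what its stdout is connected to, which is why the reading side of the pipe cannot change it.

#include <cstdio>
#include <unistd.h>   // isatty, fileno (POSIX)

int main()
{
    // A producer that wants prompt output even through a pipe has to opt in
    // before writing anything, for example:
    std::setvbuf(stdout, nullptr, _IOLBF, 0);   // force line buffering

    // By default, stdio picks its buffering mode from what stdout is connected to:
    if (isatty(fileno(stdout)))
        std::puts("stdout is a terminal: line-buffered, flushed at every newline");
    else
        std::puts("stdout is a pipe or file: block-buffered until the buffer fills");
    return 0;
}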
There is not much that can be done from the reading side; please see this question.
I suggest you put in some assertion that pipe is not nullptr. If it is nullptr, then perhaps print out command, errno and strerror().
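A minimal sketch of that check, reusing command and result from InvokeCmd above (whether to return the empty result or throw is up to you):

#include <cerrno>    // errno
#include <cstring>   // std::strerror
#include <iostream>

// inside InvokeCmd, right after the popen() call:
FILE *pipe = popen(command.c_str(), "r");
if (pipe == nullptr)
{
    std::cerr << "popen failed for command: " << command
              << " (errno " << errno << ": " << std::strerror(errno) << ")\n";
    return result;   // result is still empty at this point
}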
In Ruby code I am running a system call with Open3.popen3 and using the resulting IO objects for stdout and stderr to do some log message formatting before writing to one log file. I was wondering what the best way to do this would be so that log messages maintain the correct order; note that I need to format error messages differently from stdout messages.
Here's my current code (assume logger is thread-safe):
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
  stdout_thr = Thread.new do
    while (line = stdout.gets)          # gets returns nil at EOF
      logger.info(format(:info, line.chomp))
    end
  end
  stderr_thr = Thread.new do
    while (line = stderr.gets)
      logger.error(format(:error, line.chomp))
    end
  end
  [stdout_thr, stderr_thr].each(&:join)
end
This has worked for me so far, but I'm not so confident that I can guarantee the correct order of the log messages. Is there a better way?
What you're trying to achieve is not possible with a guarantee. The first thing to note is that your code can only order lines by the time the data was received, not by when it was produced, which is not quite the same thing. The only way to guarantee ordering would be to do something at the source that imposes an ordering across the two streams.
The code below should make it "more likely" to be correct by removing the threads. Assuming that you're using MRI, the threads cannot actually run Ruby code at the same time (the global interpreter lock serialises them), so you're beholden to the scheduler choosing to run your thread at the "right" time.
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
  for_reading = [stdout, stderr]
  until for_reading.empty?
    wait_timeout = 1
    # IO.select blocks until one of the streams has something to read
    # or the wait timeout is reached
    readable, _writable, _errors = IO.select(for_reading, [], [], wait_timeout)
    # readable is nil in the case of a timeout - loop back again
    if readable.nil?
      Thread.pass
    else
      # In the case that both streams are readable (and thus have content),
      # read from each of them. In this case we cannot guarantee any order,
      # because we receive the items at essentially the same time.
      # We can still ensure that we don't mix data incorrectly.
      readable.each do |stream|
        buffer = ''
        # keep reading until there is an EOF (value is nil)
        # or there is no more data to read right now (:wait_readable)
        while true
          tmp = stream.read_nonblock(4096, exception: false)
          if tmp.nil?
            # stream is at EOF - nothing more to read on that one
            for_reading -= [stream]
            break
          elsif tmp == :wait_readable
            # nothing more to read right now...
            # continue on to process the buffer into lines and log them
            break
          else
            buffer << tmp
          end
        end
        if stream == stdout
          buffer.split("\n").each { |line| logger.info(format(:info, line)) }
        elsif stream == stderr
          buffer.split("\n").each { |line| logger.error(format(:error, line)) }
        end
      end
    end
  end
end
Note that in a system generating a lot of output in a very short period of time, there is more likely to be an overlap where things get out of order. This likelihood increases with the amount of time taken to read the stream and process it. It would be best to ensure that the absolute minimum processing is done inside the loop. If the formatting (and writing) are expensive, consider moving those items into a separate thread reading from a single queue, and have the code inside the loop only push the buffer (and source identifier) onto the queue.
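For example, a rough sketch of that idea, assuming the same logger and format helpers as above: the select loop only pushes the raw buffer plus a source tag onto a Queue, and a single worker thread does the formatting and writing, preserving arrival order.

queue = Queue.new

log_worker = Thread.new do
  while (item = queue.pop)               # a nil sentinel stops the worker
    source, buffer = item
    buffer.split("\n").each do |line|
      if source == :stdout
        logger.info(format(:info, line))
      else
        logger.error(format(:error, line))
      end
    end
  end
end

# Inside the readable.each block above, replace the formatting with just:
#   queue << [stream == stdout ? :stdout : :stderr, buffer]
# and once the until loop has finished:
queue << nil
log_worker.join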
I'm trying to write a DTrace script which does the following:
Whenever a new thread is started, increment a count.
Whenever one of these threads exits, decrement the count, and exit the script if the count is now zero.
I have something like this:
BEGIN {
    threads_alive = 0;
}

proc:::lwp-start /execname == $$1/ {
    self->started = timestamp;
    threads_alive += 1;
}

proc:::lwp-exit /self->started/ {
    threads_alive -= 1;
    if (threads_alive == 0) {
        exit(0);
    }
}
However, this doesn't work, because threads_alive is a scalar variable and thus it is not multi-cpu safe. As a result, multiple threads will overwrite each other's changes to the variable.
I have also tried using an aggregate variable instead:
@thread_count = sum(1);
/* or */
@threads_entered = count();
@threads_exited = count();
Unfortunately, I haven't found syntax that would let me test something like @thread_count == 0 or @threads_entered == @threads_exited.
DTrace doesn't have facilities for doing the kind of thread-safe data sharing you're proposing, but you have a few options depending on precisely what you're trying to do.
If the executable name is unique, you can use the proc:::start and proc:::exit probes for the start of the first thread and the exit of the last thread respectively:
proc:::start
/execname == $$1/
{
    my_pid = pid;
}

proc:::exit
/pid == my_pid/
{
    exit(0);
}
If you're using the -c option to dtrace, the BEGIN probe fires very shortly after the corresponding proc:::start. Internally, dtrace -c forks the specified command and then starts tracing at one of four points: exec (before the first instruction of the new program), preinit (after ld has loaded all libraries), postinit (after each library's _init has run), or main (right before the first instruction of the program's main function, though this is not supported on macOS).
If you use dtrace -x evaltime=exec -c <program>, BEGIN will fire right before the first instruction of the program executes:
# dtrace -xevaltime=exec -c /usr/bin/true -n 'BEGIN{ts = timestamp}' -n 'pid$target:::entry{printf("%dus", (timestamp - ts)/1000); exit(0); }'
dtrace: description 'BEGIN' matched 1 probe
dtrace: description 'pid$target:::entry' matched 1767 probes
dtrace: pid 1848 has exited
CPU ID FUNCTION:NAME
10 16757 _dyld_start:entry 285us
The 285us is the time it takes dtrace to resume the process via /proc (or ptrace(2) on macOS). Rather than proc:::start or proc:::lwp-start, you may be able to use BEGIN, pid$target::_dyld_start:entry, or pid$target::main:entry.
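For instance, a rough sketch along those lines, assuming the target is launched with dtrace -c so that $target is set: BEGIN marks the start, and the traced process's own proc:::exit ends the script.

BEGIN
{
    start = timestamp;
}

pid$target::main:entry
{
    trace("main() reached");
}

proc:::exit
/pid == $target/
{
    printf("process ran for %d us\n", (timestamp - start) / 1000);
    exit(0);
}

You would run it with something like dtrace -x evaltime=exec -c ./myprogram -s lifetime.d (the program and script names here are placeholders).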
I am trying to continuously read a file in Ruby (which is growing over time and needs to be processed in a separate process). Currently I am achieving this with the following bit of code:
r, w = IO.pipe
pid = Process.spawn('ffmpeg' + ffmpeg_args, { STDIN => r, STDERR => STDOUT })
Process.detach pid
while true do
  IO.copy_stream(open(@options['filename']), w)
  sleep 1
end
However, while this works, I can't imagine that it is the most performant way of doing it. An alternative would be the following variation:
step = 1024 * 4
copied = 0
pid = Process.spawn('ffmpeg' + ffmpeg_args, { STDIN => r, STDERR => STDOUT })
Process.detach pid
while true do
  IO.copy_stream(open(@options['filename']), w, step, copied)
  copied += step
  sleep 1
end
which would only continuously copy parts of the file (the issue here being if the step should ever overreach the end of the file). Other approaches, such as a simple file read, led to ffmpeg failing when there was no new data. With this solution the frames are dropped if no new data is available (which is what I need).
Is there a better (more performant) way to achieve something like that?
EDIT:
Using the method proposed by @RaVeN I am now using the following code:
# INotify::Notifier comes from the rb-inotify gem
open(@options['filename'], 'rb') do |stream|
  stream.seek(0, IO::SEEK_END)
  queue = INotify::Notifier.new
  queue.watch(@options['filename'], :modify) do
    w.write stream.read
  end
  queue.run
end
However, now ffmpeg complains about invalid data. Is there a method other than read that I should use?
I have the following method. The idea is to run a shell command and both stream the output to stdout as it's received and store it in a variable, so I can return a hash of the information. I found no standard way of doing this (you either get streaming or captured output).
It does this by creating forks that stream the output and append it to an IO pipe that I can read from at a later point.
def self.run_cmd(cmd)
  stdout_rd, stdout_wr = IO.pipe
  stderr_rd, stderr_wr = IO.pipe
  status = Open4::popen4(cmd) do |_pid, _stdin, _stdout, _stderr|
    pids = []
    pids << fork do
      _stdout.each_line do |l|
        print l
        stdout_wr.puts l
      end
    end
    pids << fork do
      _stderr.each_line do |l|
        print l
        stderr_wr.puts l
      end
    end
    pids.each { |pid| Process.waitpid(pid) }
  end
  stdout_wr.close
  stderr_wr.close
  out = stdout_rd.gets
  out = '' if out.nil?
  err = stderr_rd.gets
  err = '' if err.nil?
  { stdout: out, stderr: err, status: status.exitstatus }
end
This works great in most scenarios, but unzip specifically doesn't play well with this approach: after a fixed amount of output from unzip, it will stall at stdout_wr.puts l.
I've observed that when the Ruby process has stalled, a zombie unzip process is visible when running ps.
Is there any way I can make this work?
Is there a better way of doing this? I appreciate that it's a complex solution and that there must be an easier way.
My theory is that my IO pipe is running out of buffer space, but I'm able to print 10,000 lines of output without issue.
I have a program which is calling another program and processing the child's output, i.e.:
my $pid = open($handle, "$commandPath $options |");
Now I've tried a couple different ways to read from the handle without blocking with little or no success.
I found related questions:
perl-win32-how-to-do-a-non-blocking-read-of-a-filehandle-from-another-process
why-does-my-perl-sysread-block-when-reading-from-a-socket
But they suffer from these problems:
ioctl consistently crashes perl
sysread blocks on 0 bytes (a common occurrence)
I'm not sure how to go about solving this problem.
Pipes are not as functional on Windows as they are on Unix-y systems. You can't use the 4-argument select on them, and the default capacity is minuscule.
You are better off trying a socket or file based workaround.
$pid = fork();
if (defined($pid) && $pid == 0) {
    exit system("$commandPath $options > $someTemporaryFile");
}
open($handle, "<$someTemporaryFile");
Now you have a couple more cans of worms to deal with -- running waitpid periodically to check when the background process has stopped creating output, calling seek $handle,0,1 to clear the eof condition after you read from $handle, and cleaning up the temporary file -- but it works.
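A rough sketch of that loop, reusing $pid, $handle, and $someTemporaryFile from above (process_line is a stand-in for whatever you do with each line):

use POSIX ":sys_wait_h";   # for WNOHANG

my $done = 0;
until ($done) {
    # reap the child once it has finished producing output
    $done = 1 if waitpid($pid, WNOHANG) > 0;

    while (my $line = <$handle>) {
        process_line($line);       # stand-in for your own processing
    }
    seek $handle, 0, 1;            # clear the eof condition so later reads see new data
    sleep 1 unless $done;
}
close $handle;
unlink $someTemporaryFile;         # clean up the temporary file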
I have written the Forks::Super module to deal with issues like this (and many others). For this problem you would use it like this:
use Forks::Super;
my $pid = fork { cmd => "$commandPath $options", child_fh => "out" };
my $job = Forks::Super::Job::get($pid);
while (!$job->is_complete) {
    @someInputToProcess = $job->read_stdout();
    # ... process input ...
    # ... optional sleep here so you don't consume CPU waiting for input ...
}
waitpid $pid, 0;
@theLastInputToProcess = $job->read_stdout();