worker thread does not yield to the main thread - windows

I have a minimal reproducible case where a worker thread is not yielding to the main thread, or IOW, the scheduler is not triggering a context switch to the main thread, or IOOW, the task in the worker thread is blocking the main thread.
I'm on Windows 10.0.18362 and using ActiveState Perl 5.28.
foo.pl:
use strict;
use File::Basename;
use threads;
use Data::Dumper;
# sub yield {
# sleep(1);
# }
sub wait_for_notepad {
my #notepads = ();
while (!scalar(#notepads)) {
#notepads = grep /notepad\.exe/, qx(tasklist);
print(Dumper(\#notepads));
# yield();
}
}
sub main {
my $thr = threads->create(sub {
# yield();
system("cmd /c \"notepad.exe\"");
});
my $script_dir = dirname($0);
my $output = qx(perl $script_dir\\bar.pl);
print("$output");
wait_for_notepad();
system("taskkill /im notepad.exe /f /t");
$thr->join();
}
main();
bar.pl (place in the same directory as foo.pl)
use strict;
print("hello from bar.pl\n");
Then run with perl path/to/foo.pl.
The docs mention a threads->yield(), but it also mentions that it's a no-op on most systems, which seems to be how it's behaving for me. Another option I found was to use a sleep() right before the worker thread calls system() (the commented parts of the code). Are there any other alternatives that would help me tell the worker thread to yield to the main thread?
Edit:
Sorry for being unclear. The exact sequence of behavior I'm seeing from this is when the script is ran, notepad opens up. I'm expecting to see bar.pl print "hello from bar.pl" in the console concurrently, but that does not happen. The main thread sits there blocked for several seconds with nothing printed to stdout until I interactively close notepad. Is this expected behavior? Perhaps there's a better way to achieve parallel/multi processing in perl, or I'm just doing this wrong.
Edit:
Also, if I edit the line where I have my $output = qx(perl $script_dir\\bar.pl); to system("perl $script_dir\\bar.pl"), the problem disappears and no blocking occurs.
Edit:
Title may be misleading. Sorry about that. I'm bad at making titles when it's late in the day =P. By "yield", I mean some worker thread gets interrupted to give some time to the main thread. But that's weird too b.c. if each thread is running on their own core then you'd get parallel processing anyway, so "yielding" makes no sense in this context. Anyway, never mind all that; in any case, the title should've been something more like "process deadlock when using perl threads api with no synchronization primitives". That's probably more accurate here.

Related

Ruby spawn process, capturing STDOUT/STDERR, while behaving as if it were spawned regularly

What I'm trying to achieve:
From a Ruby process, spawning a subprocess
The subprocess should print as normal back to the terminal. By "normal", I mean the process shouldn't miss out color output, or ignore user input (STDIN).
For that subprocess, capturing STDOUT/STDERR (jointly) e.g. into a String variable that can be accessed after the subprocess is dead. Escape characters and all.
Capturing STDOUT/STDERR is possible by passing a different IO pipe, however the subprocess can then detect that it's not in a tty. For example git log will not print characters that influence text color, nor use it's pager.
Using a pty to launch the process essentially "tricks" the subprocess into thinking it's being launched by a user. As far as I can tell, this is exactly what I want, and the result of this essentially ticks all the boxes.
My general tests to test if a solution fits my needs is:
Does it run ls -al normally?
Does it run vim normally?
Does it run irb normally?
The following Ruby code is able to check all the above:
to_execute = "vim"
output = ""
require 'pty'
require 'io/console'
master, slave = PTY.open
slave.raw!
pid = ::Process.spawn(to_execute, :in => STDIN, [:out, :err] => slave)
slave.close
master.winsize = $stdout.winsize
Signal.trap(:WINCH) { master.winsize = $stdout.winsize }
Signal.trap(:SIGINT) { ::Process.kill("INT", pid) }
master.each_char do |char|
STDOUT.print char
output.concat(char)
end
::Process.wait(pid)
master.close
This works for the most part but it turns out it's not perfect. For some reason, certain applications seem to fail to switch into a raw state. Even though vim works perfectly fine, it turned out neovim did not. At first I thought it was a bug in neovim but I have since been able to reproduce the problem using the Termion crate for the Rust language.
By setting to raw manually (IO.console.raw!) before executing, applications like neovim behave as expected, but then applications like irb do not.
Oddly spawning another pty in Python, within this pty, allows the application to work as expected (using python -c 'import pty; pty.spawn("/usr/local/bin/nvim")'). This obviously isn't a real solution, but interesting nonetheless.
For my actual question I guess I'm looking towards any help to resolving the weird raw issue or, say if I've completely misunderstood tty/pty, any different direction to where/how I should look at the problem.
[edited: see the bottom for the amended update]
Figured it out :)
To really understand the problem I read up a lot on how a PTY works. I don't think I really understood it properly until I drew it out. Basically PTY could be used for a Terminal emulator, and that was the simplest way to think of the data flow for it:
keyboard -> OS -> terminal -> master pty -> termios -> slave pty -> shell
|
v
monitor <- OS <- terminal <- master pty <- termios
(note: this might not be 100% correct, I'm definitely no expert on the subject, just posting it incase it helps anybody else understand it)
So the important bit in the diagram that I hadn't really realised was that when you type, the only reason you see your input on screen is because it's passed back (left-wards) to the master.
So first thing's first - this ruby script should first set the tty to raw (IO.console.raw!), it can restore it after execution is finished (IO.console.cooked!). This'll make sure the keyboard inputs aren't printed by this parent Ruby script.
Second thing is the slave itself should not be raw, so the slave.raw! call is removed. To explain this, I originally added this because it removes extra return carriages from the output: running echo hello results in "hello\r\n". What I missed was that this return carriage is a key instruction to the terminal emulator (whoops).
Third thing, the process should only be talking to the slave. Passing STDIN felt convenient, but it upsets the flow shown in the diagram.
This brings up a new problem on how to pass user input through, so I tried this. So we basically pass STDIN to the master:
input_thread = Thread.new do
STDIN.each_char do |char|
master.putc(char) rescue nil
end
end
that kind of worked, but it has its own issues in terms of some interactive processes weren't receiving a key some of the time. Time will tell, but using IO.copy_stream instead appears to solve that issue (and reads much nicer of course).
input_thread = Thread.new { IO.copy_stream(STDIN, master) }
update 21st Aug:
So the above example mostly worked, but for some reason keys like CTRL+c still wouldn't behave correctly. I even looked up other people's approach to see what I could be doing wrong, and effectively it seemed the same approach - as IO.copy_stream(STDIN, master) was successfully sending 3 to the master. None of the following seemed to help at all:
master.putc 3
master.putc "\x03"
master.putc "\003"
Before I went and delved into trying to achieve this in a lower level language I tried out 1 more thing - the block syntax. Apparently the block syntax magically fixes this problem.
To prevent this answer getting a bit too verbose, the following appears to work:
require 'pty'
require 'io/console'
def run
output = ""
IO.console.raw!
input_thread = nil
PTY.spawn('bash') do |read, write, pid|
Signal.trap(:WINCH) { write.winsize = STDOUT.winsize }
input_thread = Thread.new { IO.copy_stream(STDIN, write) }
read.each_char do |char|
STDOUT.print char
output.concat(char)
end
Process.wait(pid)
end
input_thread.kill if input_thread
IO.console.cooked!
end
Bundler.send(:with_env, Bundler.clean_env) do
run
end

How to stop a process from within the tests, when testing a never-ending process?

I am developing a long-running program in Ruby. I am writing some integration tests for this. These tests need to kill or stop the program after starting it; otherwise the tests hang.
For example, with a file bin/runner
#!/usr/bin/env ruby
while true do
puts "Hello World"
sleep 10
end
The (integration) test would be:
class RunReflectorTest < TestCase
test "it prints a welcome message over and over" do
out, err = capture_subprocess_io do
system "bin/runner"
end
assert_empty err
assert_includes out, "Hello World"
end
end
Only, obviously, this will not work; the test starts and never stops, because the system call never ends.
How should I tackle this? Is the problem in system itself, and would Kernel#spawn provide a solution? If so, how? Somehow the following keeps the out empty:
class RunReflectorTest < TestCase
test "it prints a welcome message over and over" do
out, err = capture_subprocess_io do
pid = spawn "bin/runner"
sleep 2
Process.kill pid
end
assert_empty err
assert_includes out, "Hello World"
end
end
. This direction also seems like it will cause a lot of timing-issues (and slow tests). Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered and then immediately kill the subprocess. I cannot find how to do this with Process.
Test Behavior, Not Language Features
First, what you're doing is a TDD anti-pattern. Tests should focus on behaviors of methods or objects, not on language features like loops. If you must test a loop, construct a test that checks for a useful behavior like "entering an invalid response results in a re-prompt." There's almost no utility in checking that a loop loops forever.
However, you might decide to test a long-running process by checking to see:
If it's still running after t time.
If it's performed at least i iterations.
If a loop exits properly given certain input or upon reaching a boundary condition.
Use Timeouts or Signals to End Testing
Second, if you decide to do it anyway, you can just escape the block with Timeout::timeout. For example:
require 'timeout'
# Terminates block
Timeout::timeout(3) { `sleep 300` }
This is quick and easy. However, note that using timeout doesn't actually signal the process. If you run this a few times, you'll notice that sleep is still running multiple times as a system process.
It's better is to signal the process when you want to exit with Process::kill, ensuring that you clean up after yourself. For example:
pid = spawn 'sleep 300'
Process::kill 'TERM', pid
sleep 3
Process::wait pid
Aside from resource issues, this is a better approach when you're spawning something stateful and don't want to pollute the independence of your tests. You should almost always kill long-running (or infinite) processes in your test teardown whenever you can.
Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered and then immediately kill the subprocess. I cannot find how to do this with Process.
You can redirect stdout of spawned process to any file descriptor by specifying out option
pid = spawn(command, :out=>"/dev/null") # write mode
Documentation
Example of redirection
With the answer from CodeGnome on how to use Timeout::timeout and the answer from andyconhin on how to redirect Process::spawn IO, I came up with two Minitest helpers that can be used as follows:
it "runs a deamon" do
wait_for(timeout: 2) do
wait_for_spawned_io(regexp: /Hello World/, command: ["bin/runner"])
end
end
The helpers are:
def wait_for(timeout: 1, &block)
Timeout::timeout(timeout) do
yield block
end
rescue Timeout::Error
flunk "Test did not pass within #{timeout} seconds"
end
def wait_for_spawned_io(regexp: //, command: [])
buffer = ""
begin
read_pipe, write_pipe = IO.pipe
pid = Process.spawn(command.shelljoin, out: write_pipe, err: write_pipe)
loop do
buffer << read_pipe.readpartial(1000)
break if regexp =~ buffer
end
ensure
read_pipe.close
write_pipe.close
Process.kill("INT", pid)
end
buffer
end
These can be used in a test which allows me to start a subprocess, capture the STDOUT and as soon as it matches the test Regular Expression, it passes, else it will wait 'till timeout and flunk (fail the test).
The loop will capture output and pass the test once it sees matching output. It uses a IO.pipe because that is most transparant for subprocesses (and their children) to write to.
I doubt this will work on Windows. And it needs some cleaning up of the wait_for_spawned_io which is doing slightly too much IMO. Antoher problem is that the Process.kill('INT') might not reach the children which are orphaned but still running after this test has ran. I need to find a way to ensure the entire subtree of processes is killed.

How do I make a Ruby script run once a second?

I have a Ruby script that needs to run about one time a second. I am using a Ruby script to keep track of modifications of files in a directory and want the script to track updates in "live" time.
Basically, I want my script to do the same kind of thing as running "top" on a Unix shell, where the screen is updated every second or so. Is there an equivalent to setInterval in Ruby like there is in JavaScript?
There are a few ways to do this.
The quick-and-dirty versions:
shell (kornish):
while :; do
my_ruby_script.rb
sleep 1
done
watch(1):
shell$ watch -n 1 my_ruby_script.rb
This will run your script every second and keep the output of the most recent run displayed in your terminal.
in ruby:
while true
do_my_stuff
sleep 1
end
These all suffer from the same issue: if the actual script/function takes time to run, it makes the loop run less than every second.
Here is a ruby function that will make sure the function is called (almost) exactly every second, as long as the function doesn't take longer than a second:
def secondly_loop
last = Time.now
while true
yield
now = Time.now
_next = [last + 1,now].max
sleep (_next-now)
last = _next
end
end
Use it like this:
secondly_loop { my_function }
You may find interesting this gem whenever
You can code repeating tasks this way:
every 1.second do
#your task
end
As stated in another answer, rb-inotify is well suited to this sort of thing. If you don't want to use it, then a simple approach is to use threads:
a = Thread.new { loop { some_method; Thread.stop } }
b = Thread.new { loop { sleep 1; break unless a.alive?; a.run } }
To stop polling, use a.kill or make sure that some_method kills its own thread with Thread.kill when some condition is met.
Using two threads like this ensures that some_method runs at least every second, regardless of the length of the operation, without having to do any time checking yourself (within the granularity of the thread scheduling, of course).
You might want to consider using something like rb-inotify to get notifications of changes of files. This way you can avoid "sleep" and keep the "live" feeling.
There is some useful information at the "Efficient Filesystem Handling" section of the Guard Gem documentation: https://github.com/guard/guard#efficient-filesystem-handling
Or you could use TimerTask from the Concurrent Ruby gem
timer_task = Concurrent::TimerTask.new(execution_interval: 1) do |task|
task.execution_interval.times{ print 'Boom! ' }
print "\n"
task.execution_interval += 1
if task.execution_interval > 5
puts 'Stopping...'
task.shutdown
end
end
timer_task.execute
Code extracted from the same link

How do I ensure only one instance of a Ruby script is running at a time?

I have a process that runs on cron every five minutes. Usually, it takes only a few seconds to run, but sometimes it takes several minutes. I want to ensure that only one version of this is running at a time.
I tried an obvious way...
File.open("/tmp/indexer_lock.tmp",'w') do |f|
exit unless f.flock(File::LOCK_EX)
end
...but it's not testing to see if it can get the lock, it's blocking until the lock is released.
Any idea what I'm missing? I'd rather not hack something using ps, but that's an alternative.
I know this is old, but for anyone interested, there's a non-blocking constant that you can pass to flock so that it returns instead of blocking.
File.new("/tmp/foo.lock").flock( File::LOCK_NB | File::LOCK_EX )
Update for slhck
flock will return true if this process received the lock, false otherwise. So to ensure just one process is running at a time you just want to try to get the lock, and exit if you weren't able to. It's as simple as putting an exit unless in front of the line of code I have above:
exit unless File.new("/tmp/foo.lock").flock( File::LOCK_NB | File::LOCK_EX )
Depending on your needs, this should work just fine and doesn't require creating another file anywhere.
exit unless DATA.flock(File::LOCK_NB | File::LOCK_EX)
# your script here
__END__
DO NOT REMOVE: required for the DATA object above.
Although this isn't directly answering your question, if I were you I'd probably write a daemon script (you could use http://daemons.rubyforge.org/)
You could have your indexer (assuming its indexer.rb) be run through a wrapper script named script/index for example:
require 'rubygems'
require 'daemons'
Daemons.run('indexer.rb')
And your indexer can do almost the same thing, except you specify a sleep interval
loop do
# code executing your indexing
sleep INDEXING_INTERVAL
end
This is how job processors in tandem with a queue server usually function.
You could create and delete a temporary file and check for existence of this file.
Please check the answer to this question :
one instance shell script
There's a lockfile gem for exactly this situation. I've used it before and it's dead simple.
If your using cron it might be easier to do something like this in the shell script that cron calls:
#!/usr/local/bin/bash
#
if ps -C $PROGRAM_NAME &> /dev/null ; then
: #Program is already running.. appropriate action can be performed here (kill it?)
else
#Program is not running.. launch it.
$PROGRAM_NAME
fi
Here's a one-liner that should work at the top of any Ruby script:
exit unless File.new(__FILE__)).tap {|f| f.autoclose = false}.flock(File::LOCK_NB | File::LOCK_EX)
There are two issues with the original code.
First, the reason it's blocking is that the call to #flock is missing File::LOCK_NB:
Don't block when locking. May be combined
with other lock options using logical or.
Second, if a File object is closed (whether at the end of an #open block as in the code above, via explicit #close, or implicitly auto-closed when the File is garbage-collected), the underlying file descriptor is closed and the lock is released. To prevent this you can set #autoclose =false.
Ok, working off notes from #shodanex's pointer, here's what I have. I rubied it up a little bit (though I don't know of a touch analogue in Ruby).
tmp_file = File.expand_path(File.dirname(__FILE__)) + "/indexer.lock"
if File.exists?(tmp_file)
puts "quitting"
exit
else
`touch #{tmp_file}`
end
.. do stuff ..
File.delete(tmp_file)
Can you not add File::LOCK_NB to your lock, to make it non-blocking (i.e. it fails if it can't get the lock)
That would work in C, Perl etc.
At a higher level, you might find the lock_method gem useful:
def the_method_my_cron_job_calls
# something really expensive
end
lock_method :the_method_my_cron_job_calls
It uses lockfiles stored on the local filesystem (what was being discussed above) by default, but you can also configure remote lock storage:
LockMethod.config.storage = Redis.new([...]) # a remote RedisToGo instance, perhaps?
Also...
def the_method_my_cron_job_calls
# something really expensive
end
lock_method :the_method_my_cron_job_calls, (60*60) # automatically expire lock after an hour

Create background process in windows without visible console window

How do I create a background process with Haskell on windows without a visible command window being created?
I wrote a Haskell program that runs backup processes periodically but every time I run it, a command window opens up to the top of all the windows. I would like to get rid of this window. What is the simplest way to do this?
You should really tell us how you are trying to do this currently, but on my system (using linux) the following snippet will run a command without opening a new terminal window. It should work the same way on windows.
module Main where
import System
import System.Process
import Control.Monad
main :: IO ()
main = do
putStrLn "Running command..."
pid <- runCommand "mplayer song.mp3" -- or whatever you want
replicateM_ 10 $ putStrLn "Doing other stuff"
waitForProcess pid >>= exitWith
Thanks for the responses so far, but I've found my own solution. I did try a lot of different things, from writing a vbs script as suggested to a standalone program called hstart. hstart worked...but it creates a separate process which I didn't like very much because then I can't kill it in the normal way. But I found a simpler solution that required simply Haskell code.
My code from before was a simple call to runCommand, which did popup the window. An alternative function you can use is runProcess which has more options. From peeking at the ghc source code file runProcess.c, I found that the CREATE_NO_WINDOW flag is set when you supply redirects for all of STDIN, STOUT, and STDERR. So that's what you need to do, supply redirects for those. My test program looks like:
import System.Process
import System.IO
main = do
inH <- openFile "in" ReadMode
outH <- openFile "out" WriteMode
runProcess "rsync.bat" [] Nothing Nothing (Just inH) (Just outH) (Just outH)
This worked! No command window again! A caveat is that you need an empty file for inH to read in as the STDIN eventhough in my situation it was not needed.
The simplest way I can think of is to run the rsync command from within a Windows Shell script (vbs or cmd).
I don't know anything about Haskell, but I had this problem in a C project a few months ago.
The best way to execute an external program without any windows popping up is to use the ShellExecuteEx() API function with the "open" verb. If ShellExecuteEx() is available to you in Haskell, then you should be able to achieve what you want.
The C code looks something like this:
SHELLEXECUTEINFO Info;
BOOL b;
// Execute it
memset (&Info, 0, sizeof (Info));
Info.cbSize = sizeof (Info);
Info.fMask = SEE_MASK_NOCLOSEPROCESS | SEE_MASK_FLAG_NO_UI;
Info.hwnd = NULL;
Info.lpVerb = "open";
Info.lpFile = "rsync.exe";
Info.lpParameters = "whatever parameters you like";
Info.lpDirectory = NULL;
Info.nShow = SW_HIDE;
b = ShellExecuteEx (&Info);
if (b)
{
// Looks good; if there is an instance, wait for it
if (Info.hProcess)
{
// Wait
WaitForSingleObject (Info.hProcess, INFINITE);
}
}

Resources