Process gets terminated when I call system() with wait=FALSE - opencpu

I'm trying to process videos on OpenCPU, and because they are very big I want to call the FFmpeg process using system() and allow it to keep working until it's done.
But I need to get the temporary file directory created by OpenCPU so I can poll that directory until the video conversion is done.
In order to do that I call the system function with the parameter wait=FALSE, as shown below.
This works fine if I use library(opencpu) on my machine, but when I move this into the production environment (Ubuntu 14.x) the system call gets interrupted just after starting.
Is this something that can be fixed using the OpenCPU config? Or is it a bug?
ffmpeg_exe <- "/usr/bin/ffmpeg"  # Linux path
exec_convert <- paste0(
  "( ", ffmpeg_exe, " -i ", input_file, " ", convert_command, " ", output_file,
  " 2> PROCESS_OUTPUT.txt ; ls > PROCESS_DONE.txt )"
)
system(exec_convert, wait = FALSE)

I just found out that on Linux, OpenCPU does not allow this behavior: it kills all child processes when the request returns. This is on purpose. It doesn't want orphan processes potentially running forever on the server; OpenCPU is not designed for that purpose.
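For reference, below is a rough sketch of the polling pattern the question describes: the detached command writes PROCESS_DONE.txt once ffmpeg finishes, and a client waits for that sentinel file to appear. It assumes the shell is sitting in the temporary working directory mentioned above, and it only helps on a platform where the detached process survives (such as running library(opencpu) locally); on a Linux OpenCPU server the child process is killed as described, so the sentinel would never appear.

# Minimal polling sketch (run from the working directory of the detached command)
while [ ! -f PROCESS_DONE.txt ]; do
  sleep 5                      # check for the sentinel file every 5 seconds
done
cat PROCESS_OUTPUT.txt          # ffmpeg's log, as redirected by the command above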

Related

Bash - create multiple virtual guests in one loop

I'm working on a bash script (I just started learning bash) that involves creating virtual guests on a remote server. I do this by SSH'ing from server A to server B and executing two different commands:
# create the images
$(ssh -n john@serverB.net "fallocate -l ${imgsize} /home/john/images/${imgname}")
and
# create the virtual machine
$(ssh -n john@serverB.net virt-install --bunch of options)
It is possible that these sets of commands have to be executed twice (if two virtual guests need to be created) in a loop. When the second command is run for the second time, I sometimes get this error:
Domain installation still in progress.
This means I have to wait until the previous virtual guest installation has completed. How can I do these operations in one loop? Can I run them asynchronously? Can I use threads? Or is there another way?
I have heard about the 'wait' command, but is that safe to use?
Check the man page for virt-install. You can use --wait=0 or --noautoconsole.
--wait=WAIT
Amount of time to wait (in minutes) for a VM to complete its install. Without this option, virt-install will wait for the console to close (not necessarily indicating the guest has shutdown), or in the case of --noautoconsole, simply kick off the install and exit. Any negative value will make virt-install wait indefinitely; a value of 0 triggers the same results as noautoconsole. If the time limit is exceeded, virt-install simply exits, leaving the virtual machine in its current state.
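With those options, a loop over several guests might look like the hedged sketch below. --noautoconsole (equivalent to --wait=0) makes virt-install return as soon as the install is kicked off, so the next iteration is not blocked. The guest names are illustrative, and ${imgsize} and ${virt_opts} stand in for the image size and the "bunch of options" from the question.

# Sketch: start each install without waiting for it to finish
for i in 1 2; do
  imgname="guest${i}.img"                                   # illustrative name
  ssh -n john@serverB.net "fallocate -l ${imgsize} /home/john/images/${imgname}"
  # --noautoconsole kicks off the install and returns immediately
  ssh -n john@serverB.net "virt-install --noautoconsole ${virt_opts}"
done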

Bash: Script output to terminal session stops but script finishes normally

I'm opening an SSH session to a remote server and executing a larger (around 1000 lines) bash script on the remote machine. It involves several very CPU-intensive calls which run for up to three minutes each. To track the script's progress, it echoes messages placed at several points in the script.
In general the script runs smoothly. From time to time the script runs through (the resulting file on the remote machine is correct) but the output to the terminal stops. Ctrl-C doesn't help: no prompt, just a frozen session. top in a separate session shows normal execution of the script.
My question: how do I keep the session alive?
local machine:
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.9
BuildVersion: 13A603
remote machine:
$ lsb_release -d
Description: Ubuntu 12.04.3 LTS
Personally, I would recommend using screen or tmux on the remote terminal for exactly this reason.
Those apps will allow the remote process to continue even if your local SSH session times out.
http://www.bangmoney.org/presentations/screen.html
http://tmux.sourceforge.net/
Start a screen on the remote machine and run your command from it:
screen -S largeScript
And then
./yourLargeScript.sh
Whenever your ssh session gets frozen, you can kill it with the escape sequence ~. (tilde followed by a period).
If you ssh again, you can grab back your screen by:
screen -dr largeScript
Make it log to a file instead (perhaps via syslog), and tail that file from wherever is convenient for you. This also helps detach the script so you can run it headless, from a cron job, etc. Also, if the log file has read access for others, they too can monitor it.
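As a rough sketch of that suggestion (the script name follows the example above; the log path is a placeholder): run the script detached with its output redirected to a file, then follow that file from any convenient session.

# On the remote machine: run detached, with all output going to a log file
nohup ./yourLargeScript.sh > /tmp/largescript.log 2>&1 &

# From any session (local or remote), follow the progress
tail -f /tmp/largescript.log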

makePSOCKcluster hangs on win x64 after calling system

I am experiencing a hard-to-debug problem with makePSOCKcluster from the parallel package on R x64 on Windows. It does not happen on R i386 on Windows, nor on OS X or Linux. Unfortunately it does not happen consistently either, only occasionally and quite randomly.
What happens is that the makePSOCKcluster function times out and freezes the R session, but only if earlier in the session some (arbitrary) system() calls were performed. The video and script below illustrate the problem more clearly.
Some things I tried without success:
Disabling antivirus/firewalls.
Waiting a couple of seconds between calling system and makePSOCKcluster.
Using different system calls.
How would I further narrow this down? Here is the script used in the video:
cmd_exists <- function(command){
  # Try to run the command; on Windows, system() accepts the extra
  # show.output.on.console argument that is not available elsewhere
  iswin <- identical(.Platform$OS.type, "windows")
  if(iswin){
    test <- suppressWarnings(try(system(command, intern=TRUE, ignore.stdout=TRUE, ignore.stderr=TRUE, show.output.on.console=FALSE), silent=TRUE))
  } else {
    test <- suppressWarnings(try(system(command, intern=TRUE, ignore.stdout=TRUE, ignore.stderr=TRUE), silent=TRUE))
  }
  # TRUE if the command ran without error, FALSE otherwise
  !is(test, "try-error")
}
options(hasgit = cmd_exists("git --version"));
options(haspandoc = cmd_exists("pandoc --version"));
options(hastex = cmd_exists("texi2dvi --version"));
cluster <- parallel::makePSOCKcluster(1);
makePSOCKcluster, or more generally makeCluster, can hang for any number of reasons when creating the so-called worker processes. That step involves starting new R sessions with the Rscript command; each worker executes the .slaveRSOCK function, which creates a socket connection back to the master and then runs the slaveLoop function, where it eventually executes the tasks sent to it by the master. When something goes wrong while starting any of the worker processes, the master hangs while executing socketConnection, waiting for the worker to connect to it even though that worker may have died or never even been created successfully.
Using the outfile argument is great because it often reveals the error that causes the worker process to die and thus the master to hang. But if that reveals nothing, then go to manual mode. In manual mode, the master prints the command to start each worker instead of executing the command itself. It's more work, but it gives you complete control, and you can even debug into the workers if you need to.
Here's an example:
> library(parallel)
> cl <- makePSOCKcluster(1, manual=TRUE, outfile='log.txt')
Manually start worker on localhost with
'/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()' MASTER=localhost
PORT=10187 OUT=log.txt TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
Next open a new terminal window (command prompt, or whatever), and paste in that Rscript command. As soon as you've executed it, makePSOCKcluster should return since we only requested one worker. Of course, if something goes wrong, it won't return, but if you're lucky, you'll get an error message in your terminal window and you'll have an important clue that will hopefully lead to a solution to your problem. If you're not so lucky, the Rscript command will also hang, and you'll have to dive in even deeper.
To debug the worker, you don't execute the displayed Rscript command because you need an interactive session. Instead, you start an R session with a command such as:
$ R --vanilla --args MASTER=localhost PORT=10187 OUT=log.txt TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
In that R session, you can put a breakpoint on the .slaveRSOCK function and then execute it:
> debug(parallel:::.slaveRSOCK)
> parallel:::.slaveRSOCK()
Now you can start stepping through the code, possibly setting breakpoints on the slaveLoop and makeSOCKmaster functions.

Running a Linux script remotely from Windows and getting the execution result code

I have the following scenario to deal with:
I have to schedule the backup of my company's Linux-based server (running Suse Linux) with ARCServe R15 (installed on Windows 2003 R2 SP2).
I know I have the ability in my backup software (ARCServe) to add pre/post execution scripts to my backup jobs.
On failure of the script, ARCServe is configured NOT to run the backup job; on success, it runs it. I have no problem with this.
The problem is, I want to make a Windows script (to be launched by ARCServe) for executing a Linux script on the cluster:
- If the Linux script fails, I want my Windows script to fail, so my backup job in ARCServe won't run.
- If the Linux script succeeds, I want my Windows script to end normally with error code 0, so my ARCServe job runs normally.
I've tried creating this batch file (let's call it HPC.bat):
echo ON
start /wait "C:\Program Files\PUTTY\plink.exe" -v -l root -i "C:\IST\admin\scripts\HPC\pri.ppk" [cluster_name] /appli/admin/backup_admin
exit %errorlevel%
If I manually launch this .bat by double-clicking on it, or by launching it in a command prompt under Windows, it executes normally and then ends.
If it is launched by ARCServe, the script never seems to end.
My job stays in "waiting" status; it seems the exit code of the Linux script isn't returned to my batch file, so the batch file never closes.
In my mind, what's happening is that plink just opens the connection to the Linux machine, sends the script execution command, and then closes the connection, so the exit code can't be returned to the batch file. Am I right?
Is what I want to do possible, or am I trying something impossible?
So, do I have to proceed differently?
Do I have to use PuTTY or Cygwin instead of plink?
Please help, it's giving me headaches...
If you install Cygwin, you could do it exactly like you would from Linux to Linux, i.e. remotely run a command with ssh someuser@remoteserver.com somecommand
This command will return on the calling client with the same return code that the command exited with on the remote end. If you use SSH shared keys for authentication instead of passwords, it can also be scripted without user interaction.
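A rough sketch of that approach from a Cygwin bash prompt (the host and script path follow the question; key-based authentication is assumed to be set up as suggested above): the local exit status mirrors the remote one, so the caller can test it directly.

# Run the remote script; ssh exits with the remote command's exit status
ssh root@cluster_name /appli/admin/backup_admin
status=$?
echo "remote exit code: ${status}"
exit ${status}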

Can a standalone ruby script (windows and mac) reload and restart itself?

I have a master-workers architecture where the number of workers is growing on a weekly basis. I can no longer be expected to ssh or remote console into each machine to kill the worker, do a source control sync, and restart. I would like to be able to have the master place a message out on the network that tells each machine to sync and restart.
That's where I hit a roadblock. If I were using any sane platform, I could just do:
exec('ruby', __FILE__)
...and be done. However, I did the following test:
p Process.pid
sleep 1
exec('ruby', __FILE__)
...and on Windows, I get one ruby instance for each call to exec. None of them die until I hit ^C in the window in question. On every platform I tried this on, it executes the new version of the file each time, which I have verified by making simple edits to the test script while the test marched along.
The reason I'm printing the pid is to double-check the behavior I'm seeing. On Windows, I am getting a different pid with each execution, which I would expect, considering that I am seeing a new process in the task manager for each run. The Mac is behaving correctly: the pid is the same for every system call, and I have verified with dtrace that each run is triggering a call to the execve syscall.
So, in short, is there a way to get a windows ruby script to restart its execution so it will be running any code - including itself - that has changed during its execution? Please note that this is not a rails application, though it does use activerecord.
After trying a number of solutions (including the one submitted by Byron Whitlock, which ultimately put me onto the path to a satisfactory end) I settled upon:
IO.popen("start cmd /C ruby.exe #{$0} #{ARGV.join(' ')}")
sleep 5
I found that if I didn't sleep at all after the popen and just exited, the spawn would frequently (>50% of the time) fail. This is obviously not cross-platform, so in order to have the same behavior on the Mac:
IO.popen("xterm -e \"ruby blah blah blah\"&")
The classic way to restart a program is to write another one that does it for you: you spawn a process running restart.exe <args>, then die or exit; restart.exe waits until the calling script is no longer running, then starts the script again.
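A platform-neutral sketch of that helper pattern in shell (the helper name and its arguments are illustrative, not from the answer): the helper waits for the calling process to exit, then relaunches the given command.

#!/bin/sh
# restart.sh <pid> <command> [args...]  -- illustrative restart helper
pid="$1"; shift
# Wait until the calling process has gone away
while kill -0 "$pid" 2>/dev/null; do
  sleep 1
done
# Relaunch the script with its original arguments
exec "$@"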
