Using shell operators with the Scala process builder? - bash

I'm trying to run a set of shell commands using the Scala process builder. In Scala, I run the process builder like this:
val command : String = ... // loaded from file somewhere
val processBuilder = Process(command)
val exitCode : Integer = processBuilder.!
The commands are (run one by one):
/usr/bin/R --slave --silent --file=test.R argval1 >> out
/usr/bin/R --slave --silent --file=test.R argval2 >> out
/usr/bin/R --slave --silent --file=test.R argval3 >> out
The three shell commands above run without exceptions, but the out file is never created. Then the following final command fails:
awk 'n < $0 {n=$0}END{print n}' out > final
Basically, it keeps the largest value in the file out and writes it to the file final. The awk command fails with the following error, although running it on the command line works fine:
awk: syntax error at source line 1
context is
>>> ' <<<
awk: bailing out at source line 1

Those redirects are done by the shell, and you are not running a shell. Maybe this would work better for you:
val processBuilder = Process("sh" :: "-c" :: command :: Nil)
Mind you, the process package lets you redirect input and output directly, like this:
val processBuilder = Process(Seq("/usr/bin/R", "--slave", "--silent", "--file=test.R", "argval1")) #> new java.io.File("out")
Here I'm replacing a string with a Seq because that is generally safer than letting Scala simply partition the command and arguments on spaces, since it doesn't recognize quotes.

The first option won't help if you need to run commands with |.
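For pipelines, the sys.process DSL also provides a #| operator, and files can be wired in with #<, #> and #>>, so the whole chain from the question can be expressed without a shell at all. A rough sketch (not tested against the question's setup):
import scala.sys.process._
import java.io.File

val out = new File("out")
for (arg <- Seq("argval1", "argval2", "argval3")) {
  val rCmd = Seq("/usr/bin/R", "--slave", "--silent", "--file=test.R", arg)
  (Process(rCmd) #>> out).!   // #>> appends like >> in the shell; #> would truncate
}

// feed "out" into awk with #< and write awk's output to "final" with #>
// (a pipe between two processes would look like: (p1 #| p2).!)
val awk = Process(Seq("awk", "n < $0 {n=$0}END{print n}"))
((awk #< out) #> new File("final")).!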

Related

Linux: command passed to tar in backticks is replaced on one system but not the other: what could cause that difference?

I am debugging a shell-script that works fine on one system but fails on another.
The script essentially unzips and untars archived log-files and then greps for a given substring in the contained log-files.
After some analysis and debugging I found that on one system an embedded `basename $TAR_FILENAME` command is properly executed (i.e. the command between the backticks is executed and its result replaces that part of the string), while on the other system that replacement does NOT happen and the string `basename <filename-here>` (including the backticks) is inserted literally. This of course derails the further processing of that string, and the greps don't work.
What could cause this? Can one enable or disable the backtick feature in bash?
I am not aware of any setting or switch that toggles that feature on or off. Or is there one?
Later addition:
This is the script:
#!/bin/bash
pattern=$1
for f in *.tar.gz; do
echo "$f:"
tar -xzf "$f" --to-command 'echo "f:`basename $TAR_FILENAME` s:'"$pattern\""
done
On one system this yields lines like:
f:localhost_access_log.2021-07-29.txt s:pattern
On the second this yields lines like:
f:`basename ./localhost_access_log.2021-07-29.txt` s:pattern
Both systems are on SLES-11 (very old, indeed...).
tar 1.26 passed the command to a shell (source):
argv[0] = "/bin/sh";
argv[1] = "-c";
argv[2] = to_command_option;
argv[3] = NULL;
priv_set_restore_linkdir ();
execv ("/bin/sh", argv);
tar 1.27 changed this to skip the shell as part of another fix (source):
if (wordsplit (cmd, &ws, (WRDSF_DEFFLAGS | WRDSF_ENV) & ~WRDSF_NOVAR))
FATAL_ERROR ((0, 0, _("cannot split string '%s': %s"),
cmd, wordsplit_strerror (&ws)));
execvp (ws.ws_wordv[0], ws.ws_wordv);
Since the shell is responsible for handling backticks, they will be interpreted in 1.26 but not 1.27.1.
The behavior was changed back for tar 1.29.
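A version-independent workaround (a rough sketch; the helper name to-command.sh and the PATTERN variable are my own choices, not from the original script) is to avoid backticks inside the --to-command string entirely and let tar run a small executable helper that reads TAR_FILENAME from its environment:
#!/bin/bash
# to-command.sh -- run by tar once per archive member; tar exports TAR_FILENAME,
# and PATTERN is exported by the calling script below.
cat > /dev/null   # drain the member's contents from stdin
echo "f:$(basename "$TAR_FILENAME") s:$PATTERN"
and the main loop becomes:
#!/bin/bash
pattern=$1
export PATTERN="$pattern"
for f in *.tar.gz; do
    echo "$f:"
    tar -xzf "$f" --to-command=./to-command.sh
done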

Bash - Read Directory Path From TXT, Append Executable, Then Execute

I am setting up a directory structure with many different R & bash scripts in it. They all will be referencing files and folders. Instead of hardcoding the paths I would like to have a text file where each script can search for a descriptor in the file (see below) and read the relevant path from that.
Getting the search-append to work in R is easy enough for me; I am having trouble getting it to work in Bash, since I don't know the language very well.
My guess is it has something to do with the way awk works / stores the variable, or maybe the way the / works on the awk output. But I'm not familiar enough with it and would really appreciate any help.
Text File "Master_File.txt":
NOT_DIRECTORY "/file/paths/Fake"
JOB_TEST_DIRECTORY "/file/paths/Real"
ALSO_NOT_DIRECTORY "/file/paths/Fake"
Bash Script:
#! /bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { print $2 }' $master_file_name)
Rscript --vanilla $SRCPATH/$R_SCRIPT
The last line, $SRCPATH/$R_SCRIPT, seems to replace part of $SRCPATH with the name of $R_SCRIPT, producing something like /RScript.Rs/Real instead of what I want, which is /file/paths/Real/RScript.R.
Note: if I hard code the path path="/file/paths/Real" then the code $path/$R_SCRIPT outputs what I want.
The R Script:
system(command = "echo \"SUCCESSFUL_RUN\"", intern = FALSE, wait = TRUE)
q("no")
Please let me know if there's any other info that would be helpful; I added everything I could think of. And thank you.
Edit Upon Answer:
I found two solutions.
Solution 1 - By Mheni:
[ see his answer below ]
Solution 2 - My Adaptation of Mheni's Answer:
After seeing Mheni's note on ignoring the " quotation marks, I looked up some more and found out it's possible to change the character that awk uses to determine where to separate the text. By adding -F\" to the awk call, it successfully separates based on the " character.
The following works
#!/bin/bash
master_file_name="Master_File.txt"
R_SCRIPT="RScript.R"
SRCPATH=$(awk -F\" -v r_script=$R_SCRIPT '/JOB_TEST_DIRECTORY/ { print $2 }' $master_file_name)
Rscript --vanilla $SRCPATH/$R_SCRIPT
Thank you so much everyone that took the time to help me out. I really appreciate it.
The problem is the quotes around the path; this change to the awk command strips them when printing the path.
There was also a space in the shebang line that shouldn't be there, as @david mentioned:
#!/bin/bash
master_file_name="/tmp/data"
R_SCRIPT="RScript.R"
SRCPATH=$(awk '/JOB_TEST_DIRECTORY/ { if(NR==2) { gsub("\"",""); print $2 } }' "$master_file_name")
echo "$SRCPATH/$R_SCRIPT"
OUTPUT
[1] "Hello World!"
In my example the paths are in /tmp/data:
NOT_DIRECTORY "/tmp/file/paths/Fake"
JOB_TEST_DIRECTORY "/tmp/file/paths/Real"
ALSO_NOT_DIRECTORY "/tmp/file/paths/Fake"
And in the path that corresponds to JOB_TEST_DIRECTORY I have a simple hello-world R script:
[user@host tmp]$ cat /tmp/file/paths/Real/RScript.R
print("Hello World!")
I would use the following.
Master_File.txt:
NOT_DIRECTORY="/file/paths/Fake"
JOB_TEST_DIRECTORY="/file/paths/Real"
ALSO_NOT_DIRECTORY="/file/paths/Fake"
Bash Script:
#!/bin/bash
R_SCRIPT="RScript.R"
if [[ -r /path/to/Master_File.txt ]]; then
    . /path/to/Master_File.txt
else
    echo "ERROR -- Can't read Master_File"
    exit 1
fi
Rscript --vanilla $JOB_TEST_DIRECTORY/$R_SCRIPT
Basically, you create a key=value configuration file, source it, and then use the keys as variables wherever you need them throughout the script.

GNUPLOT: A better way of piping data into a gnuplot script

I have a gnuplot script like this (simplified)
reset session
set terminal pngcairo enhanced font "Times,25" size 800,400
filename = ifilename
stats filename nooutput
N = STATS_columns
M = STATS_records
set angles degrees
set size square 1.25,1
set output ofilename
# does some stuff
...
...
...
set parametric
plot \
for [i=2:N] filename u (posX($0, column(i))):(posY($0, column(i))) w p ps 1.2 pt 7 lc rgb lcolor(i-2)
What I want to do is define ifilename (input file) and ofilename (output file) via a shell script.
So I thought the -e option might be just the one for the job.
For the gnuplot part of the script I wrote this:
gnuplot -e "ifilename='data/points_data1.dat'; ofilename='plot1'" chart.gp
but it threw the error
"chart.gp" line 8: undefined variable: ifilename
which refers to this line
filename = ifilename
I thought maybe that was because it was having trouble parsing two = signs, so I removed that line and rewrote my shell command like this:
gnuplot -e "filename='data/points_data1.dat'; ofilename='plot1'" chart.gp
but this time it threw the following error
"chart.gp" line 8: undefined variable: filename
What actually worked was this
echo "data/points_data$i.dat" | gnuplot chart.gp
where I replaced the line filename = ifilename with
FILE = system("read filename; echo $filename")
and every instance of filename with FILE in the .gp script.
But I'm not sure how to use that syntax to also define the output file.
So I was wondering, is there a better way of piping shell input into a gnuplot script?
Your original command almost worked. The invocation
gnuplot -e "ifilename='data/points_data1.dat'; ofilename='plot1'" chart.gp
correctly defined the input and output file names. But then you clobbered them inside the chart.gp script by issuing the command
reset session
which clears all variable definitions including the ones you specifically wanted. Remove that line from the script and you should be fine. If the intent of the "reset session" command was to make sure that no system-wide or private initialization file is used, then replace it with a "-d" on the command line:
gnuplot -d -e "ifilename='data/points_data1.dat'; ofilename='plot1'" chart.gp
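To drive this from a shell script for several data files, the same -e invocation can simply be put in a loop. A small sketch, reusing the question's data/points_data$i.dat naming (the plot$i.png output names are my own):
#!/bin/bash
# assumes chart.gp uses ifilename/ofilename and no longer calls "reset session"
for i in 1 2 3; do
    gnuplot -e "ifilename='data/points_data${i}.dat'; ofilename='plot${i}.png'" chart.gp
done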
FILE = system("read filename; echo $filename")
is actually fine.
If you want to pipe the output to some file, you can just omit set output "something.png" and instead send the .png output directly to stdout by running a script like this:
#!/usr/bin/env gnuplot
reset session
set terminal pngcairo enhanced font "Times,25" size 800,400
...
Then you can redirect that output into a .png file like this:
./chart.gp > mypng.png
So the final command would look something like this
echo "data/points_data$i.dat" | gnuplot chart.gp > plot$i.png

Redirect / pipe into read command

This is a follow-up to my previous question on SO. I am still trying to call a script deepScript from within another script shallowScript and process its output before displaying it on the terminal. Here is a code sample:
deepScript.sh
#!/bin/zsh
print "Hello - this is deepScript"
read "ans?Reading : "
print $ans
shallowScript.sh
#!/bin/zsh
function __process {
    while read input; do
        echo $input | sed "s/e/E/g"
    done
}
print "Hello - this is shallowScript"
. ./deepScript.sh |& __process
(Edited: the outcome of this syntax and of two alternatives is pasted below.)
[UPDATE]
I have tried alternative syntaxes for the last redirection . ./deepScript.sh |& __process and each syntax has a different outcome, but of course none of them is the one I want. I'll paste each syntax and the resulting output of ./shallowScript.sh (where I typed "input" when read was waiting for input), together with my findings so far.
Option 1 : . ./deepScript.sh |& __process
From this link, it seems that . ./deepScript.sh is run from a subshell, but not __process. Output:
zsh : ./shallowScript.sh
Hello - this is shallowScript
HEllo - this is dEEpScript
input
REading : input
Basically, the first two lines are printed as expected; then, instead of printing the prompt REading :, the script waits directly for input on stdin, and only then prints the prompt and executes print $ans.
Option 2: __process < <(. ./deepScript.sh)
Zsh's manpage indicates that (. ./deepScript.sh) will run as a subprocess. To me, that looks similar to Option 1. Output:
Hello - this is shallowScript
Reading : HEllo - this is dEEpScript
input
input
So, within . ./deepScript.sh, it prints read's prompt (script line 3) before the print (script line 2). Strange.
Option 3: __process < =(. ./deepScript.sh)
According to the same manpage, (. ./deepScript.sh) here sends its output to a temp file, which is then injected into __process (I don't know whether there is a subprocess or not). Output:
Hello - this is shallowScript
Reading : input
HEllo - this is dEEpScript
input
Again, deepScript's line 3 prints to the terminal before line 2, but now it waits for the read to be complete.
Two questions:
Should this be expected?
Is there a fix or a workaround?
The observed delay stems from two factors:
deepScript.sh and __process run asynchronously
read reads a complete line before returning
deepScript.sh writes the prompt to standard error, but without a newline. It then waits for your input while __process continues to wait for a full line to be written so that its call to read can finish.
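One possible workaround, sketched on the assumption that you only need stdout filtered: pipe just stdout through __process with | instead of |&, so the prompt (written to standard error) reaches the terminal immediately while the printed lines are still transformed:
#!/bin/zsh
# shallowScript.sh, with the last line changed from |& to |
print "Hello - this is shallowScript"
. ./deepScript.sh | __process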

How to call shell commands from Ruby

How do I call shell commands from inside of a Ruby program? How do I then get output from these commands back into Ruby?
This explanation is based on a commented Ruby script from a friend of mine. If you want to improve the script, feel free to update it at the link.
First, note that when Ruby calls out to a shell, it typically calls /bin/sh, not Bash. Some Bash syntax is not supported by /bin/sh on all systems.
Here are ways to execute a shell script:
cmd = "echo 'hi'" # Sample string that can be used
Kernel#`, commonly called backticks – `cmd`
This is like many other languages, including Bash, PHP, and Perl.
Returns the result (i.e. standard output) of the shell command.
Docs: http://ruby-doc.org/core/Kernel.html#method-i-60
value = `echo 'hi'`
value = `#{cmd}`
Built-in syntax, %x( cmd )
Following the x character is a delimiter, which can be any character. If the delimiter is one of the characters (, [, {, or <, the literal consists of the characters up to the matching closing delimiter, taking account of nested delimiter pairs. For all other delimiters, the literal comprises the characters up to the next occurrence of the delimiter character. String interpolation #{ ... } is allowed.
Returns the result (i.e. standard output) of the shell command, just like the backticks.
Docs: https://docs.ruby-lang.org/en/master/syntax/literals_rdoc.html#label-Percent+Strings
value = %x( echo 'hi' )
value = %x[ #{cmd} ]
Kernel#system
Executes the given command in a subshell.
Returns true if the command was found and run successfully, false otherwise.
Docs: http://ruby-doc.org/core/Kernel.html#method-i-system
wasGood = system( "echo 'hi'" )
wasGood = system( cmd )
Kernel#exec
Replaces the current process by running the given external command.
Returns nothing; the current process is replaced and never continues.
Docs: http://ruby-doc.org/core/Kernel.html#method-i-exec
exec( "echo 'hi'" )
exec( cmd ) # Note: this will never be reached because of the line above
Here's some extra advice:
$?, which is the same as $CHILD_STATUS, accesses the status of the last command executed via backticks, system(), or %x{}.
You can then access the exitstatus and pid properties:
$?.exitstatus
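For example (a quick sketch; the missing directory name is arbitrary):
system("ls /nonexistent")   # prints an error message and returns false
puts $?.exitstatus          # non-zero exit code of the failed ls
puts $?.pid                 # pid of the child process that just finished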
For more reading see:
http://www.elctech.com/blog/i-m-in-ur-commandline-executin-ma-commands
http://blog.jayfields.com/2006/06/ruby-kernel-system-exec-and-x.html
http://tech.natemurray.com/2007/03/ruby-shell-commands.html
Here's a flowchart based on "When to use each method of launching a subprocess in Ruby". See also, "Trick an application into thinking its stdout is a terminal, not a pipe".
The way I like to do this is using the %x literal, which makes it easy (and readable!) to use quotes in a command, like so:
directorylist = %x[find . -name '*test.rb' | sort]
In this case, that will populate directorylist with all the test files under the current directory, which you can then process as expected:
directorylist.each_line do |filename|
  filename.chomp!
  # work with file
end
Here's the best article in my opinion about running shell scripts in Ruby: "6 Ways to Run Shell Commands in Ruby".
If you only need to get the output, use backticks.
I needed more advanced things, like STDOUT and STDERR, so I used the Open4 gem. All the methods are explained there.
My favourite is Open3
require "open3"
Open3.popen3('nroff -man') { |stdin, stdout, stderr| ... }
Some things to think about when choosing between these mechanisms are:
Do you just want stdout, or do you need stderr as well? Or even separated out?
How big is your output? Do you want to hold the entire result in memory?
Do you want to read some of your output while the subprocess is still running?
Do you need result codes?
Do you need a Ruby object that represents the process and lets you kill it on demand?
You may need anything from simple backticks (``), system(), and IO.popen to full-blown Kernel.fork/Kernel.exec with IO.pipe and IO.select.
You may also want to throw timeouts into the mix if a sub-process takes too long to execute.
Unfortunately, it very much depends.
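For the timeout point above, one possible sketch using the standard library (the 5-second limit and the sleep command are arbitrary):
require "open3"
require "timeout"

Open3.popen2("sleep 60") do |stdin, stdout, wait_thread|
  begin
    Timeout.timeout(5) { puts stdout.read }   # give the command five seconds to finish
  rescue Timeout::Error
    Process.kill("TERM", wait_thread.pid)     # kill it if it takes too long
  end
end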
I'm definitely not a Ruby expert, but I'll give it a shot:
$ irb
system "echo Hi"
Hi
=> true
You should also be able to do things like:
cmd = 'ls'
system(cmd)
One more option:
When you:
need stderr as well as stdout
can't/won't use Open3/Open4 (they throw exceptions in NetBeans on my Mac, no idea why)
You can use shell redirection:
puts %x[cat bogus.txt].inspect
=> ""
puts %x[cat bogus.txt 2>&1].inspect
=> "cat: bogus.txt: No such file or directory\n"
The 2>&1 syntax has worked across Linux, Mac, and Windows since the early days of MS-DOS.
The answers above are already quite great, but I really want to share the following summary article: "6 Ways to Run Shell Commands in Ruby"
Basically, it tells us:
Kernel#exec:
exec 'echo "hello $HOSTNAME"'
system and $?:
system 'false'
puts $?
Backticks (`):
today = `date`
IO#popen:
IO.popen("date") { |f| puts f.gets }
Open3#popen3 -- stdlib:
require "open3"
stdin, stdout, stderr = Open3.popen3('dc')
Open4#popen4 -- a gem:
require "open4"
pid, stdin, stdout, stderr = Open4::popen4 "false" # => [26327, #<IO:0x6dff24>, #<IO:0x6dfee8>, #<IO:0x6dfe84>]
If you really need Bash, per the note in the "best" answer.
First, note that when Ruby calls out to a shell, it typically calls /bin/sh, not Bash. Some Bash syntax is not supported by /bin/sh on all systems.
If you need to use Bash, insert bash -c "your Bash-only command" inside of your desired calling method:
quick_output = system("ls -la")
quick_bash = system("bash -c 'ls -la'")
To test:
system("echo $SHELL")
system('bash -c "echo $SHELL"')
Or if you are running an existing script file like
script_output = system("./my_script.sh")
Ruby should honor the shebang, but you could always use
system("bash ./my_script.sh")
to make sure. There may be a slight overhead from /bin/sh running /bin/bash, but you probably won't notice it.
You can also use the backtick operators (`), similar to Perl:
directoryListing = `ls /`
puts directoryListing # prints the contents of the root directory
Handy if you need something simple.
Which method you want to use depends on exactly what you're trying to accomplish; check the docs for more details about the different methods.
Using the answers here and linked in Mihai's answer, I put together a function that meets these requirements:
Neatly captures STDOUT and STDERR so they don't "leak" when my script is run from the console.
Allows arguments to be passed to the shell as an array, so there's no need to worry about escaping.
Captures the exit status of the command so it is clear when an error has occurred.
As a bonus, this one will also return STDOUT in cases where the shell command exits successfully (0) and puts anything on STDOUT. In this manner, it differs from system, which simply returns true in such cases.
Code follows. The specific function is system_quietly:
require 'open3'
class ShellError < StandardError; end
#actual function:
def system_quietly(*cmd)
  exit_status = nil
  err = nil
  out = nil
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thread|
    err = stderr.gets(nil)
    out = stdout.gets(nil)
    [stdin, stdout, stderr].each { |stream| stream.send('close') }
    exit_status = wait_thread.value
  end
  if exit_status.to_i > 0
    err = err.chomp if err
    raise ShellError, err
  elsif out
    return out.chomp
  else
    return true
  end
end
#calling it:
begin
  puts system_quietly('which', 'ruby')
rescue ShellError
  abort "Looks like you don't have the `ruby` command. Odd."
end
#output: => "/Users/me/.rvm/rubies/ruby-1.9.2-p136/bin/ruby"
We can achieve it in multiple ways.
Using Kernel#exec, nothing after this command is executed:
exec('ls ~')
Using backticks or %x
`ls ~`
=> "Applications\nDesktop\nDocuments"
%x(ls ~)
=> "Applications\nDesktop\nDocuments"
Using the Kernel#system command, which returns true if successful, false if unsuccessful, and nil if command execution fails:
system('ls ~')
=> true
Don't forget the spawn command to create a background process to execute the specified command. You can even wait for its completion using the Process class and the returned pid:
pid = spawn("tar xf ruby-2.0.0-p195.tar.bz2")
Process.wait pid
pid = spawn(RbConfig.ruby, "-eputs'Hello, world!'")
Process.wait pid
The doc says: This method is similar to #system but it doesn't wait for the command to finish.
The easiest way is, for example:
reboot = `init 6`
puts reboot
The backticks (`) method is the easiest one to call shell commands from Ruby. It returns the result of the shell command:
url_request = 'http://google.com'
result_of_shell_command = `curl #{url_request}`
Given a command like attrib:
require 'open3'
a="attrib"
Open3.popen3(a) do |stdin, stdout, stderr|
  puts stdout.read
end
I've found that while this method isn't as memorable as
system("thecommand")
or
`thecommand`
in backticks, it has an advantage over both: backticks don't seem to let me puts the command I run or store the command I want to run in a variable, and system("thecommand") doesn't seem to let me get the output, whereas this method lets me do both of those things and also access stdin, stdout, and stderr independently.
See "Executing commands in ruby" and Ruby's Open3 documentation.
If you have a more complex case than the common case that cannot be handled with ``, then check out Kernel.spawn(). This seems to be the most generic and full-featured method stock Ruby provides for executing external commands.
You can use it to:
create process groups (Windows).
redirect in, out, error to files/each-other.
set env vars, umask.
change the directory before executing a command.
set resource limits for CPU/data/etc.
Do everything that can be done with other options in other answers, but with more code.
The Ruby documentation has good enough examples:
env: hash
name => val : set the environment variable
name => nil : unset the environment variable
command...:
commandline : command line string which is passed to the standard shell
cmdname, arg1, ... : command name and one or more arguments (no shell)
[cmdname, argv0], arg1, ... : command name, argv[0] and zero or more arguments (no shell)
options: hash
clearing environment variables:
:unsetenv_others => true : clear environment variables except specified by env
:unsetenv_others => false : don't clear (default)
process group:
:pgroup => true or 0 : make a new process group
:pgroup => pgid : join to specified process group
:pgroup => nil : don't change the process group (default)
create new process group: Windows only
:new_pgroup => true : the new process is the root process of a new process group
:new_pgroup => false : don't create a new process group (default)
resource limit: resourcename is core, cpu, data, etc. See Process.setrlimit.
:rlimit_resourcename => limit
:rlimit_resourcename => [cur_limit, max_limit]
current directory:
:chdir => str
umask:
:umask => int
redirection:
key:
FD : single file descriptor in child process
[FD, FD, ...] : multiple file descriptor in child process
value:
FD : redirect to the file descriptor in parent process
string : redirect to file with open(string, "r" or "w")
[string] : redirect to file with open(string, File::RDONLY)
[string, open_mode] : redirect to file with open(string, open_mode, 0644)
[string, open_mode, perm] : redirect to file with open(string, open_mode, perm)
[:child, FD] : redirect to the redirected file descriptor
:close : close the file descriptor in child process
FD is one of the following:
:in : the file descriptor 0 which is the standard input
:out : the file descriptor 1 which is the standard output
:err : the file descriptor 2 which is the standard error
integer : the file descriptor specified by the integer
io : the file descriptor specified as io.fileno
file descriptor inheritance: close non-redirected non-standard fds (3, 4, 5, ...) or not
:close_others => false : inherit fds (default for system and exec)
:close_others => true : don't inherit (default for spawn and IO.popen)
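A minimal sketch exercising a few of the options listed above (the file names and environment variable are made up for the example):
pid = spawn({ "GREETING" => "hello" },        # env: set an environment variable
            "echo $GREETING; ls /nope",       # command line string, run via the shell
            :chdir => "/tmp",                 # change directory before executing
            :out   => "spawn_out.log",        # redirect stdout to a file
            :err   => [:child, :out])         # send stderr wherever stdout goes
Process.wait(pid)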
This is not really an answer but maybe someone will find it useful:
When using the Tk GUI on Windows and you need to call shell commands from rubyw, you will always have an annoying CMD window popping up for less than a second.
To avoid this you can use:
WIN32OLE.new('Shell.Application').ShellExecute('ipconfig > log.txt','','','open',0)
or
WIN32OLE.new('WScript.Shell').Run('ipconfig > log.txt',0,0)
Both will store the ipconfig output inside log.txt, but no windows will come up.
You will need to require 'win32ole' inside your script.
system(), exec() and spawn() will all pop up that annoying window when using TK and rubyw.
Not sure about shell commands. I used the following for capturing a system command's output into the variable val:
val = capture(:stdout) do
system("pwd")
end
puts val
shortened version:
val = capture(:stdout) { system("pwd") }
The capture method is provided by active_support/core_ext/kernel/reporting.rb.
Similarly, we can also capture standard error with :stderr.
Here's a cool one that I use in a ruby script on OS X (so that I can start a script and get an update even after toggling away from the window):
cmd = %Q|osascript -e 'display notification "Server was reset" with title "Posted Update"'|
system(cmd)
You can use the format method as below to print some information:
puts format('%s', `ps`)
puts format('%d MB', (`ps -o rss= -p #{Process.pid}`.to_i / 1024))
