I am trying to write a bash script that takes in a directory, reads each file in the directory, and then appends the first line of each file in that directory to a new file. When I hard-code the variables in my script, it works fine.
This works:
#!/bin/bash
rm /local/SomePath/multigene.firstline.btab
touch /local/SomePath/multigene.firstline.btab
btabdir=/local/SomePath/test/*
outfile=/local/SomePath/multigene.firstline.btab
for f in $btabdir
do
head -1 $f >> $outfile
done
This does not work:
#!/bin/bash
while getopts ":d:o:" opt; do
case ${opt} in
d) btabdir=$OPTARG;;
o) outfile=$OPTARG;;
esac
done
rm $outfile
touch $outfile
for f in $btabdir
do
head -1 $f >> $outfile
done
Here is how I call the script:
bash /local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh -d /local/SomePath/test/* -o /local/SomePath/out.test/multigene.firstline.btab
And here is what I get when I run it:
rm: missing operand
Try 'rm --help' for more information.
touch: missing file operand
Try 'touch --help' for more information.
/local/SomePath/Scripts/btab.besthits.wBp-q_wBm-r.sh: line 23: $outfile: ambiguous redirect
Any suggestions? I'd like to be able to use getopts so I can make the script more generic. Thanks!
You have to pay extra attention to quoting and globbing when writing bash scripts.
When you call the script with a glob (* here) it gets expanded and split into words by your shell. This happends before your script even gets executed.
If you for example do cat *.txt cat will get all .txt files in the directory as its arguments. It will be the same as calling cat afile.txt nextfile.txt (and so on). Cat will never see the asterisk.
In your script it means that the input -d /local/SomePath/test/* gets expanded som something like /local/SomePath/test/someFile /local/SomePath/test/someOtherFile /test/someThirdFile.
Subsequently getopts only takes the first file after -d as for $btabdir and the -o doesn't get handled in the case switch.
I suggest you start by quoting every variable, preferable in the "${name}" style, and only invoke the script with quoted input.
It might also be send in a directory path, test that it is a directory (test -d), and change your for loop to for f in "${btabdir}"/*
This also works:
head -n1 -q /local/SomePath/test/* >> /local/SomePath/out.test/multigene.firstline.btab
I think the right answer here is "don't do it that way." :-)
The reason your current script isn't working may be that the wildcard is expanded by your interactive shell, not by your script. Try running your command with an echo at the beginning of the line for a hint at what's really happening. Once getopts sees the second of the matched files in the glob, it stops processing options, so -o never gets read, and $outfile remains unset. And since you don't quote your variable in rm $outfile, it's as if you're running rm without options. Test the difference in your shell between rm alone and rm "".
Also, what happens to your for loop if there's a space in a filename? Since you have bash, you have arrays. And arrays are much better for processing lists of files.
Perhaps use something like this instead:
#!/bin/bash
# initialize an array
files=()
while getopts :d:o: opt; do
case "$opt" in
d)
if [[ ! -d "$OPTARG" ]]; then
printf 'ERROR: not a directory: %s\n' "$OPTARG" >&2
exit 65
fi
# add to the array
files+=( "$OPTARG"/* )
;;
o) outfile="$OPTARG" ;;
*)
printf 'ERROR: unknown option: %s\n' "$opt" >&2
exit 64
;;
esac
done
if ! rm -f "$outfile" && touch "$outfile"; then
printf 'ERROR: cannot create %s\n' "$outfile" >&2
exit 73
fi
for f in "${files[#]}"; do
read -r < "$f"
printf '%s\n' "$REPLY"
done > "$outfile"
Here are some highlights of the changes....
We're using arrays, of course. The array ${files[#]} will contain one-file-per-record, without relying on whitespace, so with proper quoting you'll avoid problems with special characters in filenames.
We test for more error conditions, and actually show errors and exit if we see them. (The exit values are sysexits.)
Instead of using head, we use read and a single redirect to $outfile. This saves multiple forks to an external program, and multiple fopen() calls to your output file.
Note that the argument to -d should be a directory, not a glob. And you can specify options multiple times. Multiple -d options will be added together, but only the last -o option will be used.
This question already has answers here:
How can I check if a program exists from a Bash script?
(39 answers)
Closed 4 years ago.
In a bash script, I need to determine whether an executable named foo is on the PATH.
You could also use the Bash builtin type -P:
help type
cmd=ls
[[ $(type -P "$cmd") ]] && echo "$cmd is in PATH" ||
{ echo "$cmd is NOT in PATH" 1>&2; exit 1; }
You can use which:
path_to_executable=$(which name_of_executable)
if [ -x "$path_to_executable" ] ; then
echo "It's here: $path_to_executable"
fi
TL;DR:
In bash:
function is_bin_in_path {
builtin type -P "$1" &> /dev/null
}
Example usage of is_bin_in_path:
% is_bin_in_path ls && echo "found in path" || echo "not in path"
found in path
In zsh:
Use whence -p instead.
For a version that works in both {ba,z}sh:
# True if $1 is an executable in $PATH
# Works in both {ba,z}sh
function is_bin_in_path {
if [[ -n $ZSH_VERSION ]]; then
builtin whence -p "$1" &> /dev/null
else # bash:
builtin type -P "$1" &> /dev/null
fi
}
To test that ALL given commands are executables in $PATH:
# True iff all arguments are executable in $PATH
function is_bin_in_path {
if [[ -n $ZSH_VERSION ]]; then
builtin whence -p "$1" &> /dev/null
else # bash:
builtin type -P "$1" &> /dev/null
fi
[[ $? -ne 0 ]] && return 1
if [[ $# -gt 1 ]]; then
shift # We've just checked the first one
is_bin_in_path "$#"
fi
}
Example usage:
is_bin_in_path ssh-agent ssh-add && setup_ssh_agent
Non-solutions to avoid
This is not a short answer because the solution must correctly handle:
Functions
Aliases
Builtin commands
Reserved words
Examples which fail with plain type (note the token after type changes):
$ alias foo=ls
$ type foo && echo "in path" || echo "not in path"
foo is aliased to `ls'
in path
$ type type && echo "in path" || echo "not in path"
type is a shell builtin
in path
$ type if && echo "in path" || echo "not in path"
if is a shell keyword
in path
Note that in bash, which is not a shell builtin (it is in zsh):
$ PATH=/bin
$ builtin type which
which is /bin/which
This answer says why to avoid using which:
Avoid which. Not only is it an external process you're launching for doing very little (meaning builtins like hash, type or command are way cheaper), you can also rely on the builtins to actually do what you want, while the effects of external commands can easily vary from system to system.
Why care?
Many operating systems have a which that doesn't even set an exit status, meaning the if which foo won't even work there and will always report that foo exists, even if it doesn't (note that some POSIX shells appear to do this for hash too).
Many operating systems make which do custom and evil stuff like change the output or even hook into the package manager.
In this case, also avoid command -v
The answer I just quoted from suggests using command -v, however this doesn't apply to the current "is the executable in $PATH?" scenario: it will fail in exactly the ways I've illustrated with plain type above.
Correct solutions
In bash we need to use type -P:
-P force a PATH search for each NAME, even if it is an alias,
builtin, or function, and returns the name of the disk file
that would be executed
In zsh we need to use whence -p:
-p Do a path search for name even if it is an alias,
reserved word, shell function or builtin.
You can use the command builtin, which is POSIX compatible:
if [ -x "$(command -v "$cmd")" ]; then
echo "$cmd is in \$PATH"
fi
The executable check is needed because command -v detects functions and aliases as well as executables.
In Bash, you can also use type with the -P option, which forces a PATH search:
if type -P "$cmd" &>/dev/null; then
echo "$cmd is in \$PATH"
fi
As already mentioned in the comments, avoid which as it requires launching an external process and might give you incorrect output in some cases.
if command -v foo ; then foo ; else echo "foo unavailable" ; fi
Use which
$ which myprogram
We can define a function for checking whether as executable exists by using which:
function is_executable() {
which "$#" &> /dev/null
}
The function is called just like you would call an executable. "$#" ensures that which gets exactly the same arguments as are given to the function.
&> /dev/null ensures that whatever is written to stdout or stderr by which is redirected to the null device (which is a special device which discards the information written to it) and not written to stdout or stderr by the function.
Since the function doesn't explicitly return with an return code, when it does return, the exit code of the latest executed executable—which in this case is which—will be the return code of the function. which will exit with a code that indicates success if it is able to find the executable specified by the argument to the function, otherwise with an exit code that indicates failure. This behavior will automatically be replicated by is_executable.
We can then use that function to conditionally do something:
if is_executable name_of_executable; then
echo "name_of_executable was found"
else
echo "name_of_executable was NOT found"
fi
Here, if executes the command(s) written between it and then—which in our case is is_executable name_of_executable—and chooses the branch to execute based on the return code of the command(s).
Alternatively, we can skip defining the function and use which directly in the if-statement:
if which name_of_executable &> /dev/null; then
echo "name_of_executable was found"
else
echo "name_of_executable was NOT found"
fi
However, I think this makes the code slightly less readable.
How would I validate that a program exists, in a way that will either return an error and exit, or continue with the script?
It seems like it should be easy, but it's been stumping me.
Answer
POSIX compatible:
command -v <the_command>
Example use:
if ! command -v <the_command> &> /dev/null
then
echo "<the_command> could not be found"
exit
fi
For Bash specific environments:
hash <the_command> # For regular commands. Or...
type <the_command> # To check built-ins and keywords
Explanation
Avoid which. Not only is it an external process you're launching for doing very little (meaning builtins like hash, type or command are way cheaper), you can also rely on the builtins to actually do what you want, while the effects of external commands can easily vary from system to system.
Why care?
Many operating systems have a which that doesn't even set an exit status, meaning the if which foo won't even work there and will always report that foo exists, even if it doesn't (note that some POSIX shells appear to do this for hash too).
Many operating systems make which do custom and evil stuff like change the output or even hook into the package manager.
So, don't use which. Instead use one of these:
command -v foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed. Aborting."; exit 1; }
type foo >/dev/null 2>&1 || { echo >&2 "I require foo but it's not installed. Aborting."; exit 1; }
hash foo 2>/dev/null || { echo >&2 "I require foo but it's not installed. Aborting."; exit 1; }
(Minor side-note: some will suggest 2>&- is the same 2>/dev/null but shorter – this is untrue. 2>&- closes FD 2 which causes an error in the program when it tries to write to stderr, which is very different from successfully writing to it and discarding the output (and dangerous!))
If your hash bang is /bin/sh then you should care about what POSIX says. type and hash's exit codes aren't terribly well defined by POSIX, and hash is seen to exit successfully when the command doesn't exist (haven't seen this with type yet). command's exit status is well defined by POSIX, so that one is probably the safest to use.
If your script uses bash though, POSIX rules don't really matter anymore and both type and hash become perfectly safe to use. type now has a -P to search just the PATH and hash has the side-effect that the command's location will be hashed (for faster lookup next time you use it), which is usually a good thing since you probably check for its existence in order to actually use it.
As a simple example, here's a function that runs gdate if it exists, otherwise date:
gnudate() {
if hash gdate 2>/dev/null; then
gdate "$#"
else
date "$#"
fi
}
Alternative with a complete feature set
You can use scripts-common to reach your need.
To check if something is installed, you can do:
checkBin <the_command> || errorMessage "This tool requires <the_command>. Install it please, and then run this tool again."
The following is a portable way to check whether a command exists in $PATH and is executable:
[ -x "$(command -v foo)" ]
Example:
if ! [ -x "$(command -v git)" ]; then
echo 'Error: git is not installed.' >&2
exit 1
fi
The executable check is needed because bash returns a non-executable file if no executable file with that name is found in $PATH.
Also note that if a non-executable file with the same name as the executable exists earlier in $PATH, dash returns the former, even though the latter would be executed. This is a bug and is in violation of the POSIX standard. [Bug report] [Standard]
Edit: This seems to be fixed as of dash 0.5.11 (Debian 11).
In addition, this will fail if the command you are looking for has been defined as an alias.
I agree with lhunath to discourage use of which, and his solution is perfectly valid for Bash users. However, to be more portable, command -v shall be used instead:
$ command -v foo >/dev/null 2>&1 || { echo "I require foo but it's not installed. Aborting." >&2; exit 1; }
Command command is POSIX compliant. See here for its specification: command - execute a simple command
Note: type is POSIX compliant, but type -P is not.
It depends on whether you want to know whether it exists in one of the directories in the $PATH variable or whether you know the absolute location of it. If you want to know if it is in the $PATH variable, use
if which programname >/dev/null; then
echo exists
else
echo does not exist
fi
otherwise use
if [ -x /path/to/programname ]; then
echo exists
else
echo does not exist
fi
The redirection to /dev/null/ in the first example suppresses the output of the which program.
I have a function defined in my .bashrc that makes this easier.
command_exists () {
type "$1" &> /dev/null ;
}
Here's an example of how it's used (from my .bash_profile.)
if command_exists mvim ; then
export VISUAL="mvim --nofork"
fi
Expanding on #lhunath's and #GregV's answers, here's the code for the people who want to easily put that check inside an if statement:
exists()
{
command -v "$1" >/dev/null 2>&1
}
Here's how to use it:
if exists bash; then
echo 'Bash exists!'
else
echo 'Your system does not have Bash'
fi
Try using:
test -x filename
or
[ -x filename ]
From the Bash manpage under Conditional Expressions:
-x file
True if file exists and is executable.
To use hash, as #lhunath suggests, in a Bash script:
hash foo &> /dev/null
if [ $? -eq 1 ]; then
echo >&2 "foo not found."
fi
This script runs hash and then checks if the exit code of the most recent command, the value stored in $?, is equal to 1. If hash doesn't find foo, the exit code will be 1. If foo is present, the exit code will be 0.
&> /dev/null redirects standard error and standard output from hash so that it doesn't appear onscreen and echo >&2 writes the message to standard error.
Command -v works fine if the POSIX_BUILTINS option is set for the <command> to test for, but it can fail if not. (It has worked for me for years, but I recently ran into one where it didn't work.)
I find the following to be more failproof:
test -x "$(which <command>)"
Since it tests for three things: path, existence and execution permission.
There are a ton of options here, but I was surprised no quick one-liners. This is what I used at the beginning of my scripts:
[[ "$(command -v mvn)" ]] || { echo "mvn is not installed" 1>&2 ; exit 1; }
[[ "$(command -v java)" ]] || { echo "java is not installed" 1>&2 ; exit 1; }
This is based on the selected answer here and another source.
If you check for program existence, you are probably going to run it later anyway. Why not try to run it in the first place?
if foo --version >/dev/null 2>&1; then
echo Found
else
echo Not found
fi
It's a more trustworthy check that the program runs than merely looking at PATH directories and file permissions.
Plus you can get some useful result from your program, such as its version.
Of course the drawbacks are that some programs can be heavy to start and some don't have a --version option to immediately (and successfully) exit.
Check for multiple dependencies and inform status to end users
for cmd in latex pandoc; do
printf '%-10s' "$cmd"
if hash "$cmd" 2>/dev/null; then
echo OK
else
echo missing
fi
done
Sample output:
latex OK
pandoc missing
Adjust the 10 to the maximum command length. It is not automatic, because I don't see a non-verbose POSIX way to do it:
How can I align the columns of a space separated table in Bash?
Check if some apt packages are installed with dpkg -s and install them otherwise.
See: Check if an apt-get package is installed and then install it if it's not on Linux
It was previously mentioned at: How can I check if a program exists from a Bash script?
I never did get the previous answers to work on the box I have access to. For one, type has been installed (doing what more does). So the builtin directive is needed. This command works for me:
if [ `builtin type -p vim` ]; then echo "TRUE"; else echo "FALSE"; fi
I wanted the same question answered but to run within a Makefile.
install:
#if [[ ! -x "$(shell command -v ghead)" ]]; then \
echo 'ghead does not exist. Please install it.'; \
exit -1; \
fi
It could be simpler, just:
#!/usr/bin/env bash
set -x
# if local program 'foo' returns 1 (doesn't exist) then...
if ! type -P foo; then
echo 'crap, no foo'
else
echo 'sweet, we have foo!'
fi
Change foo to vi to get the other condition to fire.
hash foo 2>/dev/null: works with Z shell (Zsh), Bash, Dash and ash.
type -p foo: it appears to work with Z shell, Bash and ash (BusyBox), but not Dash (it interprets -p as an argument).
command -v foo: works with Z shell, Bash, Dash, but not ash (BusyBox) (-ash: command: not found).
Also note that builtin is not available with ash and Dash.
zsh only, but very useful for zsh scripting (e.g. when writing completion scripts):
The zsh/parameter module gives access to, among other things, the internal commands hash table. From man zshmodules:
THE ZSH/PARAMETER MODULE
The zsh/parameter module gives access to some of the internal hash ta‐
bles used by the shell by defining some special parameters.
[...]
commands
This array gives access to the command hash table. The keys are
the names of external commands, the values are the pathnames of
the files that would be executed when the command would be in‐
voked. Setting a key in this array defines a new entry in this
table in the same way as with the hash builtin. Unsetting a key
as in `unset "commands[foo]"' removes the entry for the given
key from the command hash table.
Although it is a loadable module, it seems to be loaded by default, as long as zsh is not used with --emulate.
example:
martin#martin ~ % echo $commands[zsh]
/usr/bin/zsh
To quickly check whether a certain command is available, just check if the key exists in the hash:
if (( ${+commands[zsh]} ))
then
echo "zsh is available"
fi
Note though that the hash will contain any files in $PATH folders, regardless of whether they are executable or not. To be absolutely sure, you have to spend a stat call on that:
if (( ${+commands[zsh]} )) && [[ -x $commands[zsh] ]]
then
echo "zsh is available"
fi
The which command might be useful. man which
It returns 0 if the executable is found and returns 1 if it's not found or not executable:
NAME
which - locate a command
SYNOPSIS
which [-a] filename ...
DESCRIPTION
which returns the pathnames of the files which would
be executed in the current environment, had its
arguments been given as commands in a strictly
POSIX-conformant shell. It does this by searching
the PATH for executable files matching the names
of the arguments.
OPTIONS
-a print all matching pathnames of each argument
EXIT STATUS
0 if all specified commands are
found and executable
1 if one or more specified commands is nonexistent
or not executable
2 if an invalid option is specified
The nice thing about which is that it figures out if the executable is available in the environment that which is run in - it saves a few problems...
Use Bash builtins if you can:
which programname
...
type -P programname
For those interested, none of the methodologies in previous answers work if you wish to detect an installed library. I imagine you are left either with physically checking the path (potentially for header files and such), or something like this (if you are on a Debian-based distribution):
dpkg --status libdb-dev | grep -q not-installed
if [ $? -eq 0 ]; then
apt-get install libdb-dev
fi
As you can see from the above, a "0" answer from the query means the package is not installed. This is a function of "grep" - a "0" means a match was found, a "1" means no match was found.
This will tell according to the location if the program exist or not:
if [ -x /usr/bin/yum ]; then
echo "This is Centos"
fi
I'd say there isn't any portable and 100% reliable way due to dangling aliases. For example:
alias john='ls --color'
alias paul='george -F'
alias george='ls -h'
alias ringo=/
Of course, only the last one is problematic (no offence to Ringo!). But all of them are valid aliases from the point of view of command -v.
In order to reject dangling ones like ringo, we have to parse the output of the shell built-in alias command and recurse into them (command -v isn't a superior to alias here.) There isn't any portable solution for it, and even a Bash-specific solution is rather tedious.
Note that a solution like this will unconditionally reject alias ls='ls -F':
test() { command -v $1 | grep -qv alias }
If you guys/gals can't get the things in answers here to work and are pulling hair out of your back, try to run the same command using bash -c. Just look at this somnambular delirium. This is what really happening when you run $(sub-command):
First. It can give you completely different output.
$ command -v ls
alias ls='ls --color=auto'
$ bash -c "command -v ls"
/bin/ls
Second. It can give you no output at all.
$ command -v nvm
nvm
$ bash -c "command -v nvm"
$ bash -c "nvm --help"
bash: nvm: command not found
#!/bin/bash
a=${apt-cache show program}
if [[ $a == 0 ]]
then
echo "the program doesn't exist"
else
echo "the program exists"
fi
#program is not literal, you can change it to the program's name you want to check
The hash-variant has one pitfall: On the command line you can for example type in
one_folder/process
to have process executed. For this the parent folder of one_folder must be in $PATH. But when you try to hash this command, it will always succeed:
hash one_folder/process; echo $? # will always output '0'
I second the use of "command -v". E.g. like this:
md=$(command -v mkdirhier) ; alias md=${md:=mkdir} # bash
emacs="$(command -v emacs) -nw" || emacs=nano
alias e=$emacs
[[ -z $(command -v jed) ]] && alias jed=$emacs
I had to check if Git was installed as part of deploying our CI server. My final Bash script was as follows (Ubuntu server):
if ! builtin type -p git &>/dev/null; then
sudo apt-get -y install git-core
fi
To mimic Bash's type -P cmd, we can use the POSIX compliant env -i type cmd 1>/dev/null 2>&1.
man env
# "The option '-i' causes env to completely ignore the environment it inherits."
# In other words, there are no aliases or functions to be looked up by the type command.
ls() { echo 'Hello, world!'; }
ls
type ls
env -i type ls
cmd=ls
cmd=lsx
env -i type $cmd 1>/dev/null 2>&1 || { echo "$cmd not found"; exit 1; }
If there isn't any external type command available (as taken for granted here), we can use POSIX compliant env -i sh -c 'type cmd 1>/dev/null 2>&1':
# Portable version of Bash's type -P cmd (without output on stdout)
typep() {
command -p env -i PATH="$PATH" sh -c '
export LC_ALL=C LANG=C
cmd="$1"
cmd="`type "$cmd" 2>/dev/null || { echo "error: command $cmd not found; exiting ..." 1>&2; exit 1; }`"
[ $? != 0 ] && exit 1
case "$cmd" in
*\ /*) exit 0;;
*) printf "%s\n" "error: $cmd" 1>&2; exit 1;;
esac
' _ "$1" || exit 1
}
# Get your standard $PATH value
#PATH="$(command -p getconf PATH)"
typep ls
typep builtin
typep ls-temp
At least on Mac OS X v10.6.8 (Snow Leopard) using Bash 4.2.24(2) command -v ls does not match a moved /bin/ls-temp.
My setup for a Debian server:
I had the problem when multiple packages contained the same name.
For example apache2. So this was my solution:
function _apt_install() {
apt-get install -y $1 > /dev/null
}
function _apt_install_norecommends() {
apt-get install -y --no-install-recommends $1 > /dev/null
}
function _apt_available() {
if [ `apt-cache search $1 | grep -o "$1" | uniq | wc -l` = "1" ]; then
echo "Package is available : $1"
PACKAGE_INSTALL="1"
else
echo "Package $1 is NOT available for install"
echo "We can not continue without this package..."
echo "Exitting now.."
exit 0
fi
}
function _package_install {
_apt_available $1
if [ "${PACKAGE_INSTALL}" = "1" ]; then
if [ "$(dpkg-query -l $1 | tail -n1 | cut -c1-2)" = "ii" ]; then
echo "package is already_installed: $1"
else
echo "installing package : $1, please wait.."
_apt_install $1
sleep 0.5
fi
fi
}
function _package_install_no_recommends {
_apt_available $1
if [ "${PACKAGE_INSTALL}" = "1" ]; then
if [ "$(dpkg-query -l $1 | tail -n1 | cut -c1-2)" = "ii" ]; then
echo "package is already_installed: $1"
else
echo "installing package : $1, please wait.."
_apt_install_norecommends $1
sleep 0.5
fi
fi
}