Exactly why does result=$($shell_command) fail?

Why does assigning command output work in some cases and seemingly not in others? I created a minimal script to show what I mean, and I run it in a directory with one other file in it, a.txt. Please see the ??? in the script below and let me know what's wrong, perhaps try it. Thanks.
#!/bin/bash
## setup so anyone can copy/paste/run this script ("complete" part of MCVE)
tempdir=$(mktemp -d "${TMPDIR:-/tmp}"/demo.XXXX) || exit # make a temporary directory
trap 'rm -rf "$tempdir"' 0 # delete temporary directory on exit
cd "$tempdir" || exit # don't risk changing non-temporary directories
touch a.txt # create a sample file
cmd1="find . -name 'a*' -print"
eval $cmd1 # this produces "./a.txt" as expected
res1=$($cmd1)
echo "res1=$res1" # ??? THIS PRODUCES ONLY "res1=" , $res1 is blank ???
# let's try this as a comparison
cmd2="ls a*"
res2=$($cmd2)
echo "res2=$res2" # this produces "res2=a.txt"

Let's look at exactly what this does:
cmd1="find . -name 'a*' -print"
res1=$($cmd1)
echo "res1=$res1" # ??? THIS PRODUCES ONLY "res1=" , $res1 is blank ???
As per BashFAQ #50, execution of res1=$($cmd1) does the following, assuming you have no files with names starting with 'a and ending with ' (yes, with single quotes as part of the name), and that you haven't enabled the nullglob shell option:
res1=$( find . -name "'a*'" -print )
Note the quoting around the name? That quoting indicates that the 's are treated as data, rather than syntax; thus, rather than having any effect on whether the * is expanded, they're simply an additional element required to be part of any filename for it to match, which is why you get a result with no matches at all. Instead, as the FAQ tells you, use a function:
cmd1() {
find . -name 'a*' -print
}
res1=$(cmd1)
...or an array:
cmd1=( find . -name 'a*' -print )
res1=$( "${cmd1[@]}" )
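To see exactly which words the unquoted string expansion produces, you can print each word separately (a diagnostic sketch only, using the original string form of cmd1):
cmd1="find . -name 'a*' -print"
printf '<%s> ' $cmd1; echo    # prints: <find> <.> <-name> <'a*'> <-print>
The fourth word shows the single quotes riding along as literal data.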
Now, why does this happen? Read the FAQ for a full explanation. In short: Parameter expansion happens after syntactic quotes have already been applied. This is actually a Very Good Thing from a security perspective -- if all expansions recursively ran through full parsing, it would be impossible to write secure code in bash handling hostile data.
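Here is that property in miniature; the filename is a made-up hostile example:
f='$(echo pwned).txt'   # hostile-looking data, invented for illustration
ls -- "$f"              # safe: $f becomes one literal argument, the $( ) never runs
eval "ls -- $f"         # unsafe: eval restarts parsing, so the substitution executes
The first ls merely complains that no such file exists; only the eval line actually runs the embedded command.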
Now, if you don't care about security, and you also don't care about best practices, and you also don't care about being able to correctly interpret results with unusual filenames:
cmd1="find . -name 'a*' -print"
res1=$(eval "$cmd1") # Force parsing process to restart from the beginning. DANGEROUS if cmd1
# is not static (i.e. if it is constructed from user input or filenames);
# prone to being used for shell injection attacks.
echo "res1=$res1"
...but don't do that. (One can get away with sloppy practices only until one can't, and the point when one can't can be unpleasant; for the sysadmin staff at one of my former jobs, that point came when a backup-maintenance script deleted several TB worth of billing data because a buffer overflow had placed random garbage in the name of a file that was due to be deleted). Read the FAQ, follow the practices it contains.

Related

BASH Shell Find Multiple Files with Wildcard and Perform Loop with Action

I have a script that I call with an application, I can't run it from command line. I derive the directory where the script is called and in the next variable go up 1 level where my files are stored. From there I have 3 variables with the full path and file names (with wildcard), which I will refer to as "masks".
I need to find and "do something with" (copy/write their names to a new file, whatever else) each of these masks. The do something part isn't my obstacle as I've done this fine when I'm working with a single mask, but I would like to do it cleanly in a single loop instead of duplicating the loop and just referencing each mask separately, if possible.
Assume in my $FILESFOLDER directory below that I have 2 existing files, aaa0.csv & bbb0.csv, but no file matching the ccc*.csv mask.
#!/bin/bash
SCRIPTFOLDER=${0%/*}
FILESFOLDER="$(dirname "$SCRIPTFOLDER")"
ARCHIVEFOLDER="$FILESFOLDER"/archive
LOGFILE="$SCRIPTFOLDER"/log.txt
FILES1="$FILESFOLDER"/"aaa*.csv"
FILES2="$FILESFOLDER"/"bbb*.csv"
FILES3="$FILESFOLDER"/"ccc*.csv"
ALLFILES="$FILES1
$FILES2
$FILES3"
#here as an example I would like to do a loop through $ALLFILES and copy anything that matches to $ARCHIVEFOLDER.
for f in $ALLFILES; do
cp -v "$f" "$ARCHIVEFOLDER" > "$LOGFILE"
done
echo "$ALLFILES" >> "$LOGFILE"
The thing that really spins my head is that when I run something like this (I haven't done it with the copy command in place), the log file at the end shows:
filesfolder/aaa0.csv filesfolder/bbb0.csv filesfolder/ccc*.csv
Where I would expect echoing $ALLFILES just to show me the masks
filesfolder/aaa*.csv filesfolder/bbb*.csv filesfolder/ccc*.csv
In my "do something" area, I need to be able to use whatever method to find the files by their full path/name with the wildcard if at all possible. Sometimes my network is down for maintenance and I don't want to risk failing a change directory. I rarely work in linux (primarily SQL background) so feel free to poke holes in everything I've done wrong. Thanks in advance!
Here's a light refactoring with significantly fewer distracting variables.
#!/bin/bash
script=$0
folder="$(dirname "$script")"
archive="$folder"/archive
log="$folder"/log.txt # you would certainly want this in the folder, not $script/log.txt
shopt -s nullglob
all=()
for prefix in aaa bbb ccc; do
cp -v "$folder/$prefix"*.csv "$archive" >>"$log" # append, don't overwrite
all+=("$folder/$prefix"*.csv)
done
echo "${all[#]}" >> "$log"
The change in the loop to append the output of cp -v instead of overwriting is a bug fix; otherwise the log would only contain the output from the last loop iteration.
I would probably prefer to have the files echoed from inside the loop as well, one per line, instead of collecting them all on one humongous line. Then you can remove the array all and instead simply
printf '%s\n' "$folder/$prefix"*.csv >>"$log"
shopt -s nullglob is a Bash extension (so won't work with sh) which says to discard any wildcard which doesn't match any files (the default behavior is to leave globs unexpanded if they don't match anything). If you want a different solution, perhaps see Test whether a glob has any matches in Bash.
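You can see the difference directly; assume a directory with no ccc*.csv files:
shopt -u nullglob
echo ccc*.csv    # no match: the pattern itself is printed
shopt -s nullglob
echo ccc*.csv    # no match: the glob expands to nothing, leaving a blank line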
You should use lower case for your private variables so I changed that, too. Notice also how the script variable doesn't actually contain a folder name (or "directory" as we adults prefer to call it); fixing that uncovered a bug in your attempt.
If your wildcards are more complex, you might want to create an array for each pattern.
tmpspaces=(/tmp/*\ *)
homequest=($HOME/*\?*)
for file in "${tmpspaces[@]}" "${homequest[@]}"; do
: stuff with "$file", with proper quoting
done
The only robust way to handle file names which could contain shell metacharacters is to use an array variable; using string variables for file names is notoriously brittle.
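A minimal sketch of the array approach, reusing the $folder variable from above:
csvs=( "$folder"/*.csv )     # the glob expands here, at assignment time
for f in "${csvs[@]}"; do    # each element stays one word, spaces and all
printf 'found: %s\n' "$f"
done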
Perhaps see also https://mywiki.wooledge.org/BashFAQ/020

Bash scripting print list of files

It's my first time using BASH scripting and I've been looking at some tutorials but can't figure out some code. I just want to list all the files in a folder, but I can't do it.
Here's my code so far.
#!/bin/bash
# My first script
echo "Printing files..."
FILES="/Bash/sample/*"
for f in $FILES
do
echo "this is $f"
done
and here is my output:
Printing files...
this is /Bash/sample/*
What is wrong with my code?
You misunderstood what bash means by the word "in". The statement for f in $FILES simply iterates over (space-delimited) words in the string $FILES, whose value is "/Bash/sample/*" (one word). You seemingly want the files that are "in" the named directory, a spatial metaphor that bash's syntax doesn't assume, so you would have to explicitly tell it to list the files.
for f in `ls $FILES` # illustrates the problem - but don't actually do this (see below)
...
might do it. This converts the output of the ls command into a string, "in" which there will be one word per file.
NB: this example is to help understand what "in" means but is not a good general solution. It will run into trouble as soon as one of the files has a space in its name; such files will contribute two or more words to the list, each of which taken alone may not be a valid filename. This highlights (a) that you should always take extra steps to program around the whitespace problem in bash and similar shells, and (b) that you should avoid spaces in your own file and directory names, because you'll come across plenty of otherwise useful third-party scripts and utilities that have not made the effort to comply with (a). Unfortunately, proper compliance can often lead to quite obfuscated syntax in bash.
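Here is the failure mode in miniature, using a hypothetical scratch directory:
mkdir -p /tmp/indemo && touch '/tmp/indemo/a file.txt'
for f in `ls /tmp/indemo/*`; do echo "word: $f"; done
# word: /tmp/indemo/a
# word: file.txt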
I think the problem is in the path "/Bash/sample/*".
You need to change this location to an absolute path, for example:
/home/username/Bash/sample/*
Or use the tilde shorthand, for example:
~/Bash/sample/*
On most systems this is fully equivalent to:
/home/username/Bash/sample/*
where username is your current username; run whoami to see it.
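You can watch the expansion happen (or not) with echo:
echo ~/Bash/sample      # unquoted tilde expands: /home/username/Bash/sample
echo '~/Bash/sample'    # quoted tilde does not: ~/Bash/sample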
Best place for learning Bash: http://www.tldp.org/LDP/abs/html/index.html
This should work:
echo "Printing files..."
FILES=(/Bash/sample/*) # create an array.
# Works with filenames containing spaces.
# String variable does not work for that case.
for f in "${FILES[#]}" # iterate over the array.
do
echo "this is $f"
done
And you should not parse ls output.
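One reason why, in miniature: re-splitting ls output breaks names containing spaces, while the glob keeps each name intact (a sketch, run in a scratch directory):
touch 'two words.txt'
for f in $(ls); do echo "got: $f"; done   # got: two / got: words.txt
for f in *; do echo "got: $f"; done       # got: two words.txt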
To take a list of your files:
If you want to take a list of your files and see them:
ls          # takes the list
ls -sh      # takes the list, with file sizes
...
If you want to send the list of files to a file, to read and check later:
ls > FileName.Format        # takes the list and sends it to a file
ls -sh > FileName.Format    # takes the list with file sizes and sends it to a file

How to delete files like 'Incoming11781rKD'

I have a programme that is generating files like "Incoming11781Arp": the name always starts with Incoming, then there are always 5 digits, then 3 characters that can be upper-case letters, lower-case letters, numbers, or the special case _, in any combination. Like Incoming11781_pi, or Incoming11781rKD.
How can I delete them using a script run from a cron job please? I've tried -
#!/bin/bash
file=~/Mail/Incoming******
rm "$file";
but it failed saying that there was no matching file or directory.
You mustn't double-quote the variable reference for pathname expansion to occur - if you do, the wildcard characters are treated as literals.
Thus:
rm $file
Caveat: ~/Mail/Incoming****** doesn't work the way you think it does and will potentially match more files than intended, as it is equivalent to ~/Mail/Incoming*, meaning that any file that starts with Incoming will match.
To only match files starting with Incoming that are followed by exactly 6 characters, use ~/Mail/Incoming??????, as @Jidder suggests in a comment.
Note that you could make your glob (pattern) even more specific:
file=~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]
See the bash manual for a description of pathname expansion and pattern syntax: http://www.gnu.org/software/bash/manual/bashref.html#index-pathname-expansion.
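Put together as something cron can run, a sketch using the glob derived above:
#!/bin/bash
shopt -s nullglob    # a run with no matching files then deletes nothing instead of complaining
for f in ~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]; do
rm -- "$f"
done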
You can achieve the same effect with the find command...
$ directory=~/Mail/   # leave the tilde unquoted so it expands to your home directory
$ file_pattern='Incoming*'
$ find "${directory}" -name "${file_pattern}" -delete
The first two lines define the directory and the file pattern separately, the find command will then proceed to delete any matching files inside that directory.
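If ~/Mail contains subdirectories of its own, you may want to stop find from descending into them; -maxdepth 1 (a GNU/BSD extension, not POSIX) does that, and belongs before the other tests:
find "${directory}" -maxdepth 1 -type f -name "${file_pattern}" -delete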

Writing a shell conditional for extensions

I'm writing a quick shell script to build and execute my programs in one fell swoop.
I've gotten that part down, but I'd like to include a little if/else to catch bad extensions - if it's not an .adb (it's an Ada script), it won't let the rest of the program execute.
My two-part question is:
How do I grab just the extension? Or is it easier to just say *.adb?
What would the if/else statement look like? I have limited experience in Bash so I understand that's a pretty bad question.
Thanks!
There are ways to extract the extension, but you don't really need to:
if [[ $filename == *.adb ]] ; then
. . . # this code is run if $filename ends in .adb
else
. . . # this code is run otherwise
fi
(The trouble with extracting the extension is that you'd have to define what you mean by "extension". What is the extension of a file named foo? How about a file named report.2012.01.29? So general-purpose extension-extracting code is tricky, and not worth it if your goal is just to confirm that a file has a specific extension.)
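A few quick probes show the ambiguity:
f=report.2012.01.29; echo "${f##*.}"   # prints 29 -- is that an extension?
f=foo;               echo "${f##*.}"   # prints foo -- no dot, so nothing is stripped
f=archive.tar.gz;    echo "${f##*.}"   # prints gz, not tar.gz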
There are multiple ways to do it. Which is best depends in part on what the subsequent operations will be.
Given a variable $file, you might want to test what the extension is. In that case, you probably do best with:
extn=${file##*.}
This deletes everything up to and including the last dot in the name, slashes and all, leaving you with adb if the file name was adafile.adb.
If, on the other hand, you want to do different things depending on the extension, you might use:
case "$file" in
(*.adb) ...do things with .adb files;;
(*.pqr) ...do things with .pqr files;;
(*) ...cover the rest - maybe an error;;
esac
If you want the name without the extension, you can do things the more traditional way with:
base=$(basename "$file" .adb)
path=$(dirname "$file")
The basename command gives you the last component of the file name with the extension .adb stripped off. The dirname command gives you the path leading to the last component of the file name, defaulting to . (the current directory) if there is no specified path.
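For example:
file=src/adafile.adb
basename "$file" .adb   # prints: adafile
dirname "$file"         # prints: src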
The more recent way to do those last two operations is:
base=${file##*/}
path=${file%/*}
The advantage of these is that they are built-in operations that do not invoke a separate executable, so they are quicker. The disadvantage of the built-ins is that if you have a name that ends with a slash, the built-in treats it as significant but the command does not (and the command is probably giving you the more desirable behaviour, unless you want to argue GIGO).
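You can see that difference directly:
file=src/dir/
echo "${file%/*}"    # prints src/dir -- the trailing slash is significant to the expansion
dirname "$file"      # prints src     -- the command strips the trailing slash first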
There are other techniques available too. The expr command is an old, rather heavy-weight mechanism that would not normally be used (but it is very standard). There may be other techniques using the (( ... )), $(( ... )) and [[ ... ]] operators to evaluate various sorts of expression.
To get just the extension from the file path and name, use parameter expansion:
${filename##*.} # deletes everything up to and including the last dot
To compare it with the string adb, just do
if [[ ${filename##*.} != adb ]] ; then
echo Invalid extension at "$filename".
exit 1
fi
or, using `else`:
if [[ ${filename##*.} != adb ]] ; then
echo Invalid extension at "$filename".
else
# Run the script...
fi
Extension:
fileext=`echo "$filename" | sed 's_.*\.__'`
Test
if [[ x"${fileext}" = "xadb" ]] ; then
#do something
fi

How to `ls` only one level deep?

I have lots of subdirectories containing data, and I want a short list of which jobs (subdirectories) I have. I'm not happy with the following command.
$ ls H2*
H2a:
energy.dat overlap.dat
norm.dat zdip.dat ...
(much more)
H2b:
energy.dat overlap.dat
norm.dat zdip.dat ...
(much more)
This needless clutter defeats the purpose of the wildcard (limiting the output). How can I limit the output to one level deep? I'd like to see the following output
H2a/ H2b/ H2z/
Thanks for your help,
Nick
Try this
ls -d H2*/
The -d option is supposed to list "directories only", but by itself just lists
.
which I personally find kind of strange. The wildcard is needed to get an actual list of directories.
UPDATE: As @Philipp points out, you can do this even more concisely and without leaving bash by saying
echo H2*/
The difference is that ls will print the items on separate lines, which is often useful for piping to other commands.
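If you want the newline-separated form without involving ls at all, printf gives it to you with the same glob:
printf '%s\n' H2*/    # one directory name per line, straight from the shell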
You should consider using find, like this:
find . -maxdepth 1 -type d -name "H2*"
NOTE: Putting "-type d" before "-maxdepth 1" results in a warning on Debian Linux ("find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it. Please specify global options before other arguments.") No such warning is issued on Mac.
echo H2*
It's Bash that does the expansion, so you don't even need ls.
Should you have both files and directories starting with H2, you can append a slash to restrict the glob to directories:
echo H2*/
Perhaps this is what you are looking for?
ls | grep '^H2'    # quote the pattern so the shell doesn't expand it before grep sees it
Use tree by Steve Baker at http://mama.indstate.edu/users/ice/tree/
It fills in for a lot of things that are missing from ls.
To list directories one layer deep:
tree -adi -L 1 H2*
