Bash: losing quotation marks in expressions that contain variables after expansion [duplicate]

I just started learning Bash scripting and wrote a little something:
#!/bin/bash
for str in stra strb strc
do
find . -name "*${str}*" | sort | cut -c3- > "${str}.list"
done
As you can see, I'm trying to create three files ("stra.list", "strb.list" and "strc.list") which would list the names of the files containing "stra", "strb", or "strc" respectively in the current directory. The cut -c3- hack is just for getting rid of the path name ./ at the beginning of find results.
But all my script does right now is create three empty files...
So when I run
for str in stra strb strc;
do
echo "find .-name "*${str}*" | sort | cut -c3- > "${str}.list"";
done
I only see
find . -name *stra* | sort | cut -c3- > stra.list
find . -name *strb* | sort | cut -c3- > strb.list
find . -name *strc* | sort | cut -c3- > strc.list
So how can I retain the quotes around the expressions containing the variables after the expansion? I tried putting an extra set of quotes as well as using eval, but neither seems to work.
Update:
What I'm asking is how to write my find command so that Bash successfully produces the three lists of target file names, instead of just creating three blank lists. Sorry about the confusion.

If your goal is to generate valid shell commands, then simply trying to preserve quotes is doing it wrong (in the sense that it's actually insecure; maliciously generated variable contents can perform shell injection attacks). Instead, tell the shell itself to generate valid quoting for the content you want to preserve; printf %q will do this in bash.
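As a quick illustration (a sketch; the exact escaping %q chooses can vary between bash versions), %q turns an arbitrary string into something the shell will read back as a single word:
s='*strb* with spaces; $(date)'
printf '%q\n' "$s"   # prints something like: \*strb\*\ with\ spaces\;\ \$\(date\)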
This particular variant requires a new enough bash to have printf -v.
#!/bin/bash
for str in stra strb strc; do
printf -v start '%q ' find . -name "*${str}*"
printf -v end '%q ' "${str}.list"
printf '%s\n' "$start | sort | cut -c3- >$end"
done
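For str=stra, for example, this prints roughly find . -name \*stra\* | sort | cut -c3- >stra.list, which the shell would parse back into exactly the original words.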
By contrast, if you simply want to fix the bugs in your original script, assuming you have GNU find:
#!/bin/bash
for str in stra strb strc; do
find . -maxdepth 1 -name "*${str}*" -printf '%P\n' | sort > "${str}.list"
done
The find action -printf '%P\n' prints the name without the starting directory, meaning no ./ is present to need to be stripped.
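For example, with a hypothetical file named stra-notes.txt in the current directory:
find . -name '*stra*'                 # prints ./stra-notes.txt
find . -name '*stra*' -printf '%P\n'  # prints stra-notes.txt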
Since you say you're only looking for files in the current directory, by the way, this whole mess is overkill. You don't need find for the job at all.
for str in stra strb strc; do
files=( *"$str"* )
[[ -e $files ]] && printf '%s\n' "${files[@]}" >"$str.list"   # -e $files tests the first element, i.e. whether the glob matched
done
Note that output from this command can be misleading if any of your filenames contain literal newlines. (For this reason, storing filenames in newline-delimited files is a bad idea to start with).
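If you do need to cope with arbitrary filenames, a NUL-delimited variant of the same loop is safer (a sketch; it writes hypothetical stra.null files instead of line-oriented .list files):
for str in stra strb strc; do
files=( *"$str"* )
[[ -e $files ]] && printf '%s\0' "${files[@]}" >"$str.null"
done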

There are two ways:
Enclose your echo output with single quotes instead of double quotes:
echo 'find . -name "*${str}*" | sort | cut -c3- > "${str}.list"';
or put a \ (backslash) in front of your inner quotes:
echo "find .-name \"*${str}*\" | sort | cut -c3- > \"${str}.list\"";

Related

Handle files with spaces in filenames and output file names

I need to write a Bash script that achieves the following goals:
1) move the newest n pdf files from folder 1 to folder 2;
2) correctly handles files that could have spaces in file names;
3) output each file name in a specific position in a text file. (In my actual usage, I will use sed to put the file names in a specific position of an existing file.)
I tried to make an array of filenames and then move them and do the text output in a loop. However, the following array assignment cannot handle files with spaces in the filename:
pdfs=($(find -name "$DOWNLOADS/*.pdf" -print0 | xargs -0 ls -1 -t | head -n$NUM))
Suppose a file is named "Filename with Space". What I get from the above array will have "with" and "Space" in separate array entries.
I am not sure how to keep the words of a single filename from being treated as separate entries.
Can someone help me out?
Thanks!
-------------Update------------
Sorry for being vague on the third point as I thought I might be able to figure that out after achieving the first and second goals.
Basically, it is a text file that has a line starting with "%comment" near the end, and I need to insert the filenames before that line in the format "file=PATH".
The PATH is the folder 2 that I have my pdfs moved to.
You can achieve this using mapfile in conjunction with the GNU versions of find, sort, cut and head, which have options to operate on NUL-terminated filenames:
mapfile -d '' -t pdfs < <(find "$DOWNLOADS" -name '*.pdf' -printf '%T@:%p\0' |
sort -z -t : -rnk1 | cut -z -d : -f2- | head -z -n "$NUM")
Commands used are:
mapfile -d '': reads the array using NUL as the entry delimiter
find: outputs each file's modification time in seconds since the epoch + ":" + filename + NUL byte
sort: sorts reverse numerically on 1st field
cut: removes 1st field from output
head: outputs only first $NUM filenames
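Once the array is populated, moving the files is straightforward; for instance (a sketch, with a hypothetical $destdir standing in for your folder 2):
mv -t "$destdir" -- "${pdfs[@]}"   # GNU mv: -t names the target directory up front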
find downloads -name "*.pdf" -printf "%T@ %p\0" |
sort -z -t' ' -k1 -n |
cut -z -d' ' -f2- |
tail -z -n 3
find all *.pdf files in downloads
for each file print its modification date %T with the format specifier @, which means seconds since epoch with fractional part, then print a space, the filename, and a terminating \0
Sort the NUL-separated stream numerically on the first field, using space as the field separator
Remove the first field (the modification date) from the stream, leaving only filenames
Take the newest files, in this example the 3 newest, by using tail. We could also reverse the sort and use head; no difference.
Don't use ls in scripts; ls is for nicely formatted interactive output. You could do xargs -0 stat --printf "%Y %n\0", which would basically move your original script forward, but I couldn't make stat output the fractional part of the modification date.
As for the second part, we need to save the NUL-delimited list to a file
find downloads ........ >"$tmp"
and then:
str='%comment'
{
grep -B$((2**32)) -x "$str" "$out" | grep -v "$str"
# I don't know what you expect to do with newlines in filenames, but I guess you don't have those
cat "$tmp" | sed -z 's/^/file=/' | sed 's/\x0/\n/g'
grep -A$((2**32)) -x "$str" "$out"
} | sponge "$out"
assuming the output file name is stored in the variable "$out":
print all lines before the %comment line, removing the %comment line itself
output each filename with file= prepended; the NUL bytes are also replaced with newlines
then print all lines after %comment, including the %comment line itself
write the result back to "$out"; sponge (from moreutils) soaks up all its input before overwriting the file
Don't use pdfs=$(...) on NUL-separated input. You can use mapfile to store it in an array, as other answers have shown.
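For example (a sketch; mapfile -d '' needs bash 4.4 or newer):
mapfile -d '' -t pdfs <"$tmp"   # read the NUL-delimited list into an array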
Then to move the files, do something like
<"$tmp" xargs -0 -i mv {} "$outdir"
or faster, with a single move:
{ cat <"$tmp"; printf "%s\0" "$outdir"; } | xargs -0 mv
or alternatively:
<"$tmp" xargs -0 sh -c 'outdir="$1"; shift; mv "$#" "$outdir"' -- "$outdir"
I suppose the following code will be close to what you want:
IFS=$'\n' pdfs=($(find "$DOWNLOADS" -name '*.pdf' -print0 | xargs -0 -I{} ls -1t "{}" | tail -n +1 | head -n"$NUM"))
Then you can access the output through ${pdfs[0]}, ${pdfs[1]}, ...
Explanations
IFS=$'\n' makes the following line split only on "\n".
The -I option tells xargs to substitute {} with each filename, so it can be quoted as "{}".
tail -n +1 is a trick to suppress an error message saying "xargs: 'ls' terminated by signal 13".
Hope this helps.
Bash v4 has an option called globstar; after enabling it, we can use ** to match files in zero or more levels of subdirectories.
mapfile is a built-in command which reads lines into an indexed array variable; the -t option removes the trailing newline from each line.
shopt -s globstar
mapfile -t pdffiles < <(ls -t1 **/*.pdf | head -n"$NUM")
typeset -p pdffiles
for f in "${pdffiles[#]}"; do
echo "==="
mv "${f}" /dest/path
sed "/^%comment/i${f}=/dest/path" a-text-file.txt
done

Get index of argument with xargs?

In bash, I have a list of files all named the same (in different subdirectories) and I want to order them by creation/modification time, something like this:
ls -1t /tmp/tmp-*/my-file.txt | xargs ...
I would like to rename those files with some sort of index or something so I can move them all into the same folder. My result would ideally be something like:
my-file0.txt
my-file1.txt
my-file2.txt
Something like that. How would I go about doing this?
You can just loop through these files and keep appending an incrementing counter to the desired file name:
for f in /tmp/tmp-*/my-file.txt; do
fname="${f##*/}"
fname="${fname%.*}"$((i++)).txt
mv "$f" "/dest/dir/$fname"
done
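For example, /tmp/tmp-a/my-file.txt and /tmp/tmp-b/my-file.txt would land in /dest/dir as my-file0.txt and my-file1.txt; the counter starts at 0 because $((i++)) evaluates an unset i as 0.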
EDIT: In order to sort the listed files by modification time, as ls -1t does, you can use this script:
while IFS= read -d '' -r f; do
f="${f#* }"
fname="${f##*/}"
fname="${fname%.*}"$((i++)).txt
mv "$f" "/dest/dir/$fname"
done < <(find /tmp/tmp-* -name 'my-file.txt' -printf "%T@ %p\0" | sort -zk1nr)
This handles filenames with all special characters, like whitespace, newlines, glob characters etc., since we terminate each filename with a NUL (\0) character in the -printf option. Note that we are also using sort -z to handle the NUL-terminated data.
So I found an answer to my own question, thoughts on this one?
ls -1t /tmp/tmp-*/my-file.txt | awk 'BEGIN{ a=0 }{ printf "cp %s /tmp/all-the-files/my-file_%03d.txt\n", $0, a++ }' | bash;
I found this in another Stack Overflow question while looking for something similar that my search didn't turn up at first. I was impressed with the awk line; thought that was pretty neat.

More universal alternative to this sed command?

I have a variable called $dirs storing directories in a dir tree:
root/animals/rats/mice
root/animals/cats
And I have another variable called $remove that holds, for example, the names of the directories I want to remove from the $dirs variable:
rats
crabs
I am using a for loop to do that:
for d in $remove; do
dirs=$(echo "$dirs" | sed "/\b$d\b/d")
done
After that loop is done, what I should be left with is:
root/animals/cats
because the loop found rats.
I have tested this approach on 3 systems but it only works as expected on 2.
Is there a more universal approach that would work on all shells?
You are looking for something like
echo "${dirs}" | grep -Ev "rats|crabs"
When you can't store the exclusion list in |-separated format, build it on the fly:
echo "${dirs}" | grep -Ev $(echo "${remove}" | tr -s "\n" "|" | sed 's/|$//')
You can use the exclude-file technique (grep -f) without a temp file by using process substitution:
echo "${dirs}" | grep -vf <(echo "${remove}")
I am not sure which of these solutions will be best supported.
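Note that these plain substring patterns would also drop, say, root/animals/muskrats. If you need the whole-word behaviour of the original \b, grep's -w flag is widely supported (a sketch):
echo "${dirs}" | grep -Evw "$(echo "${remove}" | tr -s '\n' '|' | sed 's/|$//')"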

bash uses only first entry from find

I'm trying to list all PDF files under a given directory $1 (and its subdirectories), get the number of pages in each file, and calculate two numbers using the page count. My script used to work, but only on filenames that don't contain spaces and only in a directory containing nothing but PDF files. I've modified it a bit already (using quotes around variables and such), but now I'm a bit stuck.
The problem I'm having is that, as it is now, the script only processes the first file found by find . -name '*.pdf'. How would I go about processing the rest?
#!/bin/bash
wd=`pwd`
pppl=0.03 #euro
pppnl=0.033 #euro
cd $1
for entry in "`find . -name '*.pdf'`"
do
filename="$(basename "$entry")"
pagecount=`pdfinfo "$filename" | grep Pages | sed 's/[^0-9]*//'`
pricel=`echo "$pagecount * $pppl" | bc`
pricenl=`echo "$pagecount * $pppnl" | bc`
echo -e "$filename\t\t$pagecount\t$pricel\t$pricenl"
done
cd "$wd"
The problem with using find in a for loop is that if you don't quote the command substitution, filenames with spaces will be split apart, and if you do quote it, the entire result will be processed in a single iteration.
The workaround is to use a while loop instead, like this:
find . -name '*.pdf' -print0 | while IFS= read -r -d '' entry
do
....
done
Read this article for more discussion: http://mywiki.wooledge.org/ParsingLs
It's a bad idea to use word splitting. Use a while loop instead.
while read -r entry
do
filename=$(basename "$entry")
pagecount=$(pdfinfo "$filename" | grep Pages | sed 's/[^0-9]*//')
pricel=$(echo "$pagecount * $pppl" | bc)
pricenl=$(echo "$pagecount * $pppnl" | bc)
echo -e "$filename\t\t$pagecount\t$pricel\t$pricenl"
done < <(exec find . -name '*.pdf')
Also, prefer $() over backticks when possible. And you don't need to place "" around variables or command substitutions when they are used in a plain assignment.
filename=$(basename "$entry")
could simply be written as
filename=${entry##*/}
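For example:
entry='./docs/report 1.pdf'
echo "${entry##*/}"   # prints: report 1.pdf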

How can I get the output of a command into a bash variable?

I can't remember how to capture the result of an execution into a variable in a bash script.
Basically I have a folder full of backup files of the following format:
backup--my.hostname.com--1309565.tar.gz
I want to loop over a list of all files and pull the numeric part out of the filename and do something with it, so I'm doing this so far:
HOSTNAME=`hostname`
DIR="/backups/"
SUFFIX=".tar.gz"
PREFIX="backup--$HOSTNAME--"
TESTNUMBER=9999999999
#move into the backup dir
cd $DIR
#get a list of all backup files in there
FILES=$PREFIX*$SUFFIX
#Loop over the list
for F in $FILES
do
#rip the number from the filename
NUMBER=$F | sed s/$PREFIX//g | sed s/$SUFFIX//g
#compare the number with another number
if [ $NUMBER -lg $TESTNUMBER ]
#do something
fi
done
I know the "$F | sed s/$PREFIX//g | sed s/$SUFFIX//g" part rips the number correctly (though I appreciate there might be a better way of doing this), but I just can't remember how to get that result into NUMBER so I can reuse it in the if statement below.
Use the $(...) syntax (or backticks).
NUMBER=$( echo $F | sed s/$PREFIX//g | sed s/$SUFFIX//g )
or
NUMBER=` echo $F | sed s/$PREFIX//g | sed s/$SUFFIX//g `
(I prefer the first one, since it is easier to see when multiple ones nest.)
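For example, nesting stays readable with $(...):
parent=$(basename "$(dirname "$F")")   # name of the directory containing $F
whereas the backtick form would need the inner backticks escaped.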
Backticks if you want to be portable to older shells (sh):
NUMBER=`echo "$F" | sed s/$PREFIX//g | sed s/$SUFFIX//g`
Otherwise, use NUMBER=$(echo "$F" | sed s/$PREFIX//g | sed s/$SUFFIX//g). It's better and supports nesting more readily.
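As an aside, bash parameter expansion can do the same stripping without sed (a sketch using the question's own variables):
NUMBER=${F#"$PREFIX"}       # strip the prefix
NUMBER=${NUMBER%"$SUFFIX"}  # strip the suffix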
