Loop through all the files with .txt extension in bash [duplicate]

I am trying to loop over the files in a folder and test for .txt extensions.
But I get the following error: "awk: cannot open = (No such file or directory)".
Here's my code:
!/bin/bash
files=$(ls);
for file in $files
do
# extension=$($file | awk -F . '{ print $NF }');
if [ $file | awk -F . "{ print $NF }" = txt ]
then
echo $file;
else
echo "Not a .txt file";
fi;
done;

The way you are doing this is wrong in many ways.
You should never parse the output of ls. It does not handle filenames containing special characters intuitively. See Why you shouldn't parse the output of ls(1).
Don't use variables to store multi-line data. The output of ls stored in a variable undergoes word splitting when expanded unquoted, so in your case files is referenced as a plain variable and any filename containing whitespace breaks the iteration.
Using awk is absolutely unnecessary here. The part $file | awk -F . "{ print $NF }" = txt is wrong in two ways: you are not passing the name of the file to the pipe, just running the variable $file as a command; it should have been echo "$file". And the test syntax itself is invalid (see the sketch after this list for what it was presumably trying to do).
The right interpreter shebang should have been set as #!/bin/bash (note the leading #) in your script if you were planning to run it as an executable, i.e. ./script.sh. The more recommended way is #!/usr/bin/env bash, which lets env locate the bash installation found first in your PATH.
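For reference, a minimal sketch of what the commented-out awk line was presumably aiming for, using parameter expansion instead of awk (not needed for the solution below):
#!/usr/bin/env bash
for file in *; do
ext="${file##*.}" # strip everything up to and including the last dot
if [ "$ext" = "txt" ]; then
echo "$file"
else
echo "Not a .txt file"
fi
done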
As such your requirement could be simply reduced to
for file in *.txt; do
[ -f "$file" ] || continue
echo "$file"
done
This is a simple example using the glob pattern *.txt, which performs pathname expansion on all the files ending in .txt. Before the loop body runs, the glob is expanded into the list of matching files; assuming the folder contains the files 1.txt, 2.txt and foo.txt, the loop effectively becomes
for file in 1.txt 2.txt foo.txt; do
Even when no files are present, i.e. when the glob matches nothing (no text files found), the condition [ -f "$file" ] || continue ensures the loop exits gracefully by checking whether the glob returned a valid file or just the un-expanded pattern string. The test [ -f "$file" ] fails for anything except an existing regular file.
Or, since you are targeting the Bourne Again Shell (bash) anyway, enable the nullglob option so that non-matching globs expand to nothing rather than being preserved literally:
shopt -s nullglob
for file in *.txt; do
echo "$file"
done
Another way is to use a shell array to store the glob results and iterate over them later to perform a specific action. This approach is useful when passing a list of files as an argument list to another command (demonstrated after the snippet below). A properly quoted expansion "${filesList[@]}" preserves spaces, tabs, newlines and other meta-characters in filenames.
shopt -s nullglob
filesList=(*.txt)
for file in "${filesList[@]}"; do
echo "$file"
done
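For instance, the same array can be handed to a single command invocation (a minimal sketch; wc is an arbitrary stand-in for any command that accepts a file list):
# guard against an empty match, then pass every file as a separate, intact argument
[ "${#filesList[@]}" -gt 0 ] && wc -l "${filesList[@]}"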

Related

Bash File names will not append to file from script

Hello, I am trying to get all files with Jane's name into a separate file called oldFiles.txt. In a directory called "data" I am reading from a list of file names in a file called list.txt, from which I put all the file names containing the name Jane into the files variable. Then I test the files variable against the files in list.txt to ensure they are in the file system, and append all the files containing jane to the oldFiles.txt file (which will be in the scripts directory), once each item in the files variable passes the test.
#!/bin/bash
> oldFiles.txt
files= grep " jane " ../data/list.txt | cut -d' ' -f 3
if test -e ~data/$files; then
for file in $files; do
if test -e ~/scripts/$file; then
echo $file>> oldFiles.txt
else
echo "no files"
fi
done
fi
The above code gets the desired files and displays them correctly, and it creates the oldFiles.txt file, but when I open the file after running the script I find that nothing was appended. I tried changing the assignment files= grep " jane " ../data/list.txt | cut -d' ' -f 3 to a command substitution, files=$(grep " jane " ../data/list.txt), to see if capturing the raw data to write to the file would help, but then the error "too many arguments on line 5" comes up, which is the first if test statement. The only way I get the script to work semi-properly is by running ./findJane.sh > oldFiles.txt on the command line, which is essentially me creating the file manually. How would I go about creating oldFiles.txt and appending to it entirely within the script?
The biggest problem you have is matching names like "jane" or "Jane's", etc. while not matching "Janes". grep provides the options -i (case-insensitive match) and -w (whole-word match) which can tailor your search to what you appear to want without the kludge (" jane ") of appending spaces before and after your search term. (To do that properly you would use [[:space:]]jane[[:space:]].)
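For example (a sketch against the question's list.txt; the exact matches depend on its contents):
# match "jane" or "Jane" as a whole word, but not "Janes"
grep -iw 'jane' ../data/list.txt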
You also have the problem of what your "script dir" is if you call your script from a directory other than the one containing the script, such as calling it from your $HOME directory with bash script/findJane.sh. In that case your script will attempt to append to $HOME/oldFiles.txt. The special parameter $0 contains the path used to invoke the currently running script, so you can recover the script directory no matter where you call the script from with:
dirname "$0"
You are using bash, so store all the filenames resulting from your grep command in an array, not some general variable (especially since your use of " jane " suggests that your filenames contain whitespace)
You can make your script much more flexible if you take the name of your input file (e.g. list.txt), the term to search for (e.g. "jane"), the location to check for the files' existence (e.g. $HOME/data) and the output filename to append the names to (e.g. "oldFiles.txt") as command-line [positional] parameters. You can give each a default value so the script behaves as you currently desire when no arguments are provided.
Even with the additional scripting flexibility of taking command-line arguments, the script actually has fewer lines: simply fill an array using mapfile (synonymous with readarray) and then loop over the contents of the array. You could also avoid the additional subshell for dirname with a simple parameter expansion, testing whether the resulting path component is empty and replacing it with '.' (sketched below) -- up to you.
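A minimal sketch of that parameter-expansion alternative (a pure style choice; the script below keeps dirname):
script="${0%/*}" # strip the last path component
[ "$script" = "$0" ] && script=. # no '/' in $0: invoked from the current directory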
If I've understood your goal correctly, you can put all the pieces together with:
#!/bin/bash
# positional parameters
src="${1:-../data/list.txt}" # 1st param - input (default: ../data/list.txt)
term="${2:-jane}" # 2nd param - search term (default: jane)
data="${3:-$HOME/data}" # 3rd param - file location (default: $HOME/data)
outfn="${4:-oldFiles.txt}" # 4th param - output (default: oldFiles.txt)
# save the path to the current script in script
script="$(dirname "$0")"
# if outfn not given, prepend path to script to outfn to output
# in script directory (if script called from elsewhere)
[ -z "$4" ] && outfn="$script/$outfn"
# split names w/term into array
# using the -iw option for case-insensitive whole-word match
mapfile -t files < <(grep -iw "$term" "$src" | cut -d' ' -f 3)
# loop over files array
for ((i=0; i<${#files[@]}; i++)); do
# test existence of file in data directory, redirect name to outfn
[ -e "$data/${files[i]}" ] && printf "%s\n" "${files[i]}" >> "$outfn"
done
(note: test expression and [ expression ] are synonymous, use what you like, though you may find [ expression ] a bit more readable)
(further note: "Janes" being plural is not considered the same as the singular -- adjust the grep expression as desired)
Example Use/Output
As was pointed out in the comment, without a sample of your input file, we cannot provide an exact test to confirm your desired behavior.
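Still, a hypothetical invocation with every positional parameter made explicit (the values shown are just the defaults):
bash findJane.sh ../data/list.txt jane "$HOME/data" oldFiles.txt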
Let me know if you have questions.
As far as I can tell, this is what you're going for. This is very much a community effort based on the comments; credit to Mark and Jetchisel for finding most of the issues. Notable changes:
Fixed $files to use command substitution
Fixed path to data/$file, assuming you have a directory at ~/data full of files
Fixed the test to not test for a string of files, but just the single file (also using -f to make sure it's a regular file)
Using double brackets; you could also keep single brackets and double-quote the expansions instead, but you explicitly have a Bash shebang, so there's no harm in using Bash syntax
Adding a second message about not matching files, because there are two possible cases there; you may need to adapt depending on the output you're looking for
Removed the initial empty redirection; if you need to ensure that the file is empty before the rest of the script runs, it should be added back, but otherwise it does no useful work
Changed the shebang to make sure you're using the user's preferred Bash, and added set -e because you should always add set -e
#!/usr/bin/env bash
set -e
files=$(grep " jane " ../data/list.txt | cut -d' ' -f 3)
for file in $files; do
if [[ -f $HOME/data/$file ]]; then
if [[ -f $HOME/scripts/$file ]]; then
echo "$file" >> oldFiles.txt
else
echo "no matching file"
fi
else
echo "no files"
fi
done

How to compare file names in timestamp format and retrieve the file with the latest date?

I am very new to shell scripting. I have a folder containing many files with a naming convention such as test-2020-11-19-1652.tgz (yyyy-mm-dd-hhmm). I need to compare the dates (taken from the file names), pick the latest file, unzip it, and rename it. I have tried many ways but end up with errors due to my beginner level. Can anyone help me with this?
Expectation
In the above case I need to pick the file shop_db-2020-11-19-1652.tgz because it is the latest file in the folder, then unzip it and rename it to shop_db.
Expanding a file pattern always returns a sorted list, which makes it possible to extract the last entry, the one you want:
Using POSIX shell syntax:
#!/usr/bin/env sh
last() {
shift $(($# - 1))
printf %s "$1"
}
lastfile=$(last shop_db*.tgz)
if [ "$lastfile" = 'shop_db*.tgz' ]; then
lastfile=
fi
shift $(($# - 1)): Shift all arguments away except the last one.
printf %s "$1": Print the last argument, since there is only one left.
The subsequent comparison against the literal string 'shop_db*.tgz' guards against the no-match case: a POSIX shell leaves a non-matching pattern unexpanded, so last would otherwise return the pattern itself.
Using Bash syntax:
#!/usr/bin/env bash
shopt -s nullglob
lastfile=$(printf '%s\0' shop_db*.tgz | tail -z -n1 | tr -d \\0)
shopt -s nullglob: A Bash feature to return an empty list if no file matches the pattern.
printf '%s\0' shop_db*.tgz: Print a null delimited list of files matching the shop_db*.tgz globbing pattern.
| tail -z -n1: Extract the last record from this null delimited list.
Alternate method using only Bash built-in:
#!/usr/bin/env bash
shopt -s nullglob
while read -r -d '' f && [ "$f" ]
do
lastfile=$f
done < <(
printf '%s\0' shop_db*.tgz
)
echo "$lastfile"
And finally expanding the globbing pattern into an array, and extracting the last index:
#!/usr/bin/env bash
shopt -s nullglob
array=(shop_db*.tgz)
if [ ${#array[@]} -gt 0 ]
then
lastfile=${array[-1]}
fi
echo "$lastfile"
What you require can actually be achieved by utilising awk with find:
find . -name "shop_db*.tgz" | sort | awk 'END { "tar -xvf "$0|getline fil;system("mv "fil" shop_db") }'
Look for files starting with shop_db and ending with .tgz. Pipe the output into awk, untar the file in verbose mode, and read the uncompressed file name into the variable fil using awk's getline. Use this fil variable to run the necessary move command, renaming the file to shop_db via awk's system function. Since find does not guarantee any particular output order, the file list is piped through sort first, and the END block within awk then processes only the last (i.e. latest) compressed file.

Checking if substring is in filename in bash

I'm trying to create a script that identifies the names of files in a directory and then checks to see if a string is a substring of the name. I'm doing this in bash and cannot use the grep command. Any thoughts?
I have the following code to check if a user submission matches a file name or a string in the name.
read -p name
for file in sample/*; do
echo $(basename "$file")
if [[$(basename "$file") ~= $name]];
then echo "invalid"
fi
done
You can just interpolate the user input into the wildcard.
printf '%s\n' sample/*"$name"*
If you want to loop over the matches, try
for file in sample/*"$name"*; do
# cope with nullglob
test -e "$file" || break # skip the literal, unexpanded pattern when nothing matches
: do things with "$file"
done
If you just need to check that the name isn't a substring of an existing file's name:
valid=true
for file in sample/*"$name"*; do
test -e "$file" && valid=false
done
echo "$name is valid? $valid"
By default, the shell does not expand a wildcard which doesn't match any files; so in this case, your loop will run once, but the loop variable will hold the literal pattern rather than the name of an existing file. You might also want to look at the nullglob option in Bash to make the loop run zero times in this case.
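A minimal sketch of the nullglob variant (shopt is Bash-specific):
shopt -s nullglob
for file in sample/*"$name"*; do
: do things with "$file" # with nullglob, the body simply never runs when nothing matches
done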

Comparing files in the same directory with same name different extension [duplicate]

I have a bash script that looks through a directory and creates a .ppt from a .pdf, but I want to be able to check whether there is already a .pdf for the .ppt, because if there is I don't want to create one, and if the .pdf is timestamped older than the .ppt I want to update it. I know I can get a timestamp with (date -r bar +%s), but I can't figure out how to compare files with the same name in the same folder.
This is what I have:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf
else
#reads the files that are PPT in the directory and copies them and changes the extension to .pdf
ls *.ppt|while read FILE
do
NEWFILE=$(echo $FILE|cut -d"." -f1)
echo $FILE": " $FILE " "$NEWFILE: " " $NEWFILE.pdf
cp $FILE $NEWFILE.pdf
done
fi
EDITS:
#!/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f *pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] &&
nf="${f%.*}"
echo $f": " $f " "$nf: " " $nf.pdf
cp $f $nf.pdf
done
To loop through all ppt files in the current directory and test to see if they are newer than the corresponding pdf and then do_something if they are:
for f in *.ppt
do
[ "$f" -nt "${f%ppt}pdf" ] && do_something
done
-nt is the bash test for one file being newer than another.
Notes:
Do not parse ls. The output from ls often contains a "displayable" form of the filename, not the actual filename.
The construct for f in *.ppt will work reliably with all file names, even ones containing spaces, tabs, or newlines.
Avoid using all caps for shell variables. The system uses all caps for its variables and you do not want to accidentally overwrite one. Thus, use lower case or mixed case.
The shell has built-in capabilities for suffix removal. So, for example, newfile=$(echo $file |cut -d"." -f1) can be replaced with the much more efficient and more reliable form newfile="${file%%.*}". This is particularly important in the odd case that the file's name ends with a newline: command substitution removes all trailing newlines but the bash variable expansions don't.
Further, note that cut -d"." -f1 removes everything after the first period; if a file name has more than one period, this is likely not what you want. The form ${file%.*}, with just one %, removes everything after the last period in the name, which is more likely what you want when stripping standard extensions like .ppt (see the short illustration below).
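A quick illustration of the difference, using a hypothetical name containing two periods:
f="slides.v2.ppt"
echo "${f%%.*}" # slides (everything after the FIRST period removed)
echo "${f%.*}" # slides.v2 (everything after the LAST period removed)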
Putting it all together
#!/usr/bin/env bash
#checks to see if argument is clean if so it deletes the .pdf and archive files
if [ "$1" = "clean" ]; then
rm -f ./*pdf lectures.tar.gz
else
#reads the files that are in the directory and copies them and changes the extension to .pdf
for f in ./*.ppt
do
if [ "$f" -nt "${f%ppt}pdf" ]; then
nf="${f%.*}"
echo "$f: $f $nf: $nf.pdf"
cp "$f" "$nf.pdf"
fi
done
fi

Renaming batch files, but error occurs when there is space in the file name. Need a fix! Bash [duplicate]

So my code works; it's doing what I want. Essentially, my script renames files to match the last two directories in which they are placed, followed by zero padding. It also takes an argument: if you pass a directory name, it will rename the files in that specified directory.
Here's my code:
r="$#"
if [ -d "$r" ]; then # if string exists and is a directory then do the following commands
cd "$r" # change directory to the specified name
echo "$r" # print directory name
elif [ -z "$1" ]; then # if string argument is null then do following command
echo "Current Directory" # Print Current Directory
else # if string is not a directory or null then do nothing
echo "No such Directory" # print No such Directory
fi
e=`pwd | awk -F/ '{ print $(NF-1) "_" $NF }'` # print current directory | print only the last two fields
echo $e
X=1;
for i in `ls -1`; do # loop. rename all files in directory to "$e" with 4 zeroes padding.
mv $i $e.$(printf %04d.%s ${X%.*} ${i##*.}) # only .jpg files for now, but can be changed to all files.
let X="$X+1"
done
And here is the output:
Testdir_pics.0001.jpg
Testdir_pics.0002.jpg
...
However, just as the title suggests, it creates errors when the filenames have spaces in them. How do I fix this?
If there are spaces in the file names, then these two lines will fail:
for i in `ls -1`; do
mv $i $e.$(printf %04d.%s ${X%.*} ${i##*.})
Replace them with:
for i in *; do
mv "$i" "$e.$(printf %04d.%s "${X%.*}" "${i##*.}")"
Comments:
for i in * will work for all file names even those with the most difficult characters. By contrast, the for i in $(ls -1) formulation is very fragile.
Unless, for some strange reason, you really want word splitting to be performed on your variables, always place them in double-quotes. Thus, mv $i ... should be replaced with mv "$i" .... (A tiny illustration follows.)
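A tiny sketch of why the quotes matter, with a hypothetical filename and target directory:
i="my file.jpg"
mv $i dest/ # word splitting: mv receives two arguments, "my" and "file.jpg"
mv "$i" dest/ # quoted: mv receives the single argument "my file.jpg"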
