Shell script - loop through file names and extract numbers in the filename - shell

Using a linux shell script, I am trying to loop through all of the file names in a directory and extract the numbers out of the file name before I process the file.
Something like this:
for files in `ls *.gz`
do
echo "Looking at... $files"
gunzip $files
echo "$files" | awk '/[0-9]/' ' {print $1}'
echo "$files is unzipped"
done
Thank you for any help with this.

You can substitute all non-numbers in the file name.
echo "${files//[!0-9]/}"
This will obviously produce concatenated numbers if a file name contains multiple runs of digits. For example, 12a34.gz gets turned into 1234.
This substitution syntax is not part of POSIX sh: it works in Bash (and also in ksh93 and zsh), but not in a plain sh such as dash.
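Put into a loop like the one in the question, the substitution might look like the following sketch. It globs for *.gz directly instead of parsing ls, and strips the directory part first so that digits in the path do not leak into the result. The scratch directory and the sample file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Scratch directory with made-up sample names (assumption for illustration).
demo=$(mktemp -d)
touch "$demo/report12.gz" "$demo/12a34.gz"

numbers=()
for file in "$demo"/*.gz; do
    [ -e "$file" ] || continue      # skip the literal pattern if nothing matched
    name=${file##*/}                # strip the directory part
    digits=${name//[!0-9]/}         # delete every non-digit character
    echo "$name -> $digits"
    numbers+=("$digits")
done
rm -rf "$demo"
```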

Related

Bash File names will not append to file from script

Hello, I am trying to collect all files with Jane's name into a separate file called oldFiles.txt. In a directory called "data" I read a list of file names from a file called list.txt, and put every file name containing the name Jane into the files variable. I then test the files variable against the files in list.txt to ensure they exist in the file system, and append all the files containing jane to oldFiles.txt (which will be in the scripts directory) once each item in the files variable passes the test.
#!/bin/bash
> oldFiles.txt
files= grep " jane " ../data/list.txt | cut -d' ' -f 3
if test -e ~data/$files; then
for file in $files; do
if test -e ~/scripts/$file; then
echo $file>> oldFiles.txt
else
echo "no files"
fi
done
fi
The above code gets the desired files and displays them correctly, as well as creates the oldFiles.txt file, but when I open the file after running the script I find that nothing was appended to the file. I tried changing the file assignment to a pointer instead files= grep " jane " ../data/list.txt | cut -d' ' -f 3 ---> files=$(grep " jane " ../data/list.txt) to see if that would help by just capturing raw data to write to file, but then the error comes up "too many arguments on line 5" which is the 1st if test statement. The only way I get the script to work semi-properly is when I do ./findJane.sh > oldFiles.txt on the shell command line, which is me essentially manually creating the file. How would I go about this so that I create oldFiles.txt and append to the oldFiles.txt all within the script?
The biggest problem you have is matching names like "jane" or "Jane's", etc. while not matching "Janes". grep provides the options -i (case-insensitive match) and -w (whole-word match), which can tailor your search to what you appear to want without the kludge (" jane ") of adding spaces before and after your search term. (To do that properly you would use [[:space:]]jane[[:space:]].)
You also have the problem of what your "script dir" is if you call your script from a directory other than the one containing it, such as calling it from your $HOME directory with bash script/findJane.sh. In that case your script will attempt to append to $HOME/oldFiles.txt. The positional parameter $0 contains the path used to invoke the current script (which may be relative), so you can obtain the script directory no matter where you call the script from with:
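To see what -i and -w do, here is a small sketch run against made-up sample lines (the real list.txt was not shown, so the three-column layout is an assumption based on the cut -f 3 in the question):

```shell
# Made-up sample lines (assumption; the real list.txt was not posted).
sample='001 jane /data/jane_profile.doc
002 Janes /data/janes_notes.doc
003 JANE /data/jane_pic.jpg
004 kwood /data/kwood_profile.doc'

# -i ignores case, -w accepts "jane" only as a whole word
matches=$(printf '%s\n' "$sample" | grep -iw 'jane')
printf '%s\n' "$matches"
```

Only the lines numbered 001 and 003 are printed: "Janes" and "janes_notes" are rejected because the match is followed by another word character.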
dirname "$0"
You are using bash, so store all the filenames resulting from your grep command in an array, not some general variable (especially since your use of " jane " suggests that your filenames contain whitespace)
You can make your script much more flexible if you take the input file (e.g. list.txt), the term to search for (e.g. "jane"), the location where to check for existence of the files (e.g. $HOME/data) and the output filename to append the names to (e.g. "oldFiles.txt") as command line [positional] parameters. You can give each a default value so the script behaves as you currently desire when no arguments are provided.
Even with the added flexibility of command line arguments, the script ends up with fewer lines: it fills an array using mapfile (synonymous with readarray) and then loops over the array's contents. You could also avoid the extra subshell for dirname with a simple parameter expansion, testing whether the path component is empty and replacing it with '.' if so; that is up to you.
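The parameter-expansion alternative to dirname mentioned above could look like this sketch. dir_of is a hypothetical helper name, and it stores its result in the variable dir so that no command substitution (and therefore no subshell) is needed.

```shell
# dir_of is a hypothetical helper; it leaves the directory part of its
# argument in the variable "dir" without calling dirname.
dir_of() {
    dir=${1%/*}                     # drop the final /component
    [ "$dir" = "$1" ] && dir=.      # no slash at all: use the current directory
    [ -n "$dir" ] || dir=/          # a path such as /findJane.sh leaves an empty prefix
}

dir_of "$0"
script=$dir
echo "script directory: $script"
```

Unlike dirname, this sketch does not normalise trailing slashes, which should not matter for a script path.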
If I've understood your goal correctly, you can put all the pieces together with:
#!/bin/bash
# positional parameters
src="${1:-../data/list.txt}" # 1st param - input (default: ../data/list.txt)
term="${2:-jane}" # 2nd param - search term (default: jane)
data="${3:-$HOME/data}" # 3rd param - file location (default: $HOME/data)
outfn="${4:-oldFiles.txt}" # 4th param - output (default: oldFiles.txt)
# save the path to the current script in script
script="$(dirname "$0")"
# if outfn not given, prepend path to script to outfn to output
# in script directory (if script called from elsewhere)
[ -z "$4" ] && outfn="$script/$outfn"
# split names w/term into array
# using the -iw option for case-insensitive whole-word match
mapfile -t files < <(grep -iw "$term" "$src" | cut -d' ' -f 3)
# loop over files array
for ((i=0; i<${#files[@]}; i++)); do
# test existence of file in data directory, redirect name to outfn
[ -e "$data/${files[i]}" ] && printf "%s\n" "${files[i]}" >> "$outfn"
done
(note: test expression and [ expression ] are synonymous, use what you like, though you may find [ expression ] a bit more readable)
(further note: "Janes" being plural is not considered the same as the singular -- adjust the grep expression as desired)
Example Use/Output
As was pointed out in the comment, without a sample of your input file, we cannot provide an exact test to confirm your desired behavior.
Let me know if you have questions.
As far as I can tell, this is what you're going for. This is totally a community effort based on the comments, catching your bugs. Obviously credit to Mark and Jetchisel for finding most of the issues. Notable changes:
Fixed $files to use command substitution
Fixed path to data/$file, assuming you have a directory at ~/data full of files
Fixed the test to not test for a string of files, but just the single file (also using -f to make sure it's a regular file)
Using double brackets — you could also use double quotes instead, but you explicitly have a Bash shebang so there's no harm in using Bash syntax
Adding a second message about not matching files, because there are two possible cases there; you may need to adapt depending on the output you're looking for
Removed the initial empty redirection — if you need to ensure that the file is clear before the rest of the script, then it should be added back, but if not, it's not doing any useful work
Changed the shebang to make sure you're using the user's preferred Bash, and added set -e so the script stops at the first command that fails
#!/usr/bin/env bash
set -e
files=$(grep " jane " ../data/list.txt | cut -d' ' -f 3)
for file in $files; do
if [[ -f $HOME/data/$file ]]; then
if [[ -f $HOME/scripts/$file ]]; then
echo "$file" >> oldFiles.txt
else
echo "no matching file"
fi
else
echo "no files"
fi
done

How to expand macros in strings read from a file in a ksh script?

I want to read a list of file names stored in a file, and the top level directory is a macro, since this is for a script that may be run in several environments.
For example, there is a file file_list.txt holding the following fully qualified file paths:
$TOP_DIR/subdir_a/subdir_b/file_1
$TOP_DIR/subdir_x/subdir_y/subdir_z/file_2
In my script, I want to tar the files, but in order to do that, tar must know the actual path.
How can I get the string containing the file path to expand the macro to get the actual path?
In the code below the string value echoed is exactly as in the file above.
I tried using actual_file_path=`eval $file_path` and while eval does evaluate the macro, it returns a status, not the evaluated path.
for file_path in `cat $input_file_list`
do
echo "$file_path"
done
Given the ksh tag, I assume you do not have the envsubst utility.
When the number of variables in $input_file_list is very limited, you can substitute them with awk (the $ must be escaped, otherwise the regex treats it as an end-of-line anchor):
awk -v top_dir="${TOP_DIR}" '{ sub(/\$TOP_DIR/, top_dir); print }' "${input_file_list}"
I was using eval incorrectly. The solution is to use an assignment on the right side of eval as follows:
for file_path in `cat $input_file_list`
do
eval myfile=$file_path
echo "myfile = $myfile"
done
$myfile now has the actual expansion of the macro.
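Because eval executes whatever the list file contains, a more conservative variant replaces only a literal $TOP_DIR prefix using a case pattern and parameter expansion. This is a different technique from the eval solution above, shown here as a sketch; the value of TOP_DIR and the temporary list file are assumptions for the demonstration.

```shell
TOP_DIR=/opt/app                      # assumed value for the demonstration
input_file_list=$(mktemp)             # stand-in for file_list.txt
cat > "$input_file_list" <<'EOF'
$TOP_DIR/subdir_a/subdir_b/file_1
$TOP_DIR/subdir_x/subdir_y/subdir_z/file_2
EOF

while IFS= read -r file_path; do
    case $file_path in
        '$TOP_DIR'/*) myfile=$TOP_DIR${file_path#'$TOP_DIR'} ;;  # swap the literal prefix
        *)            myfile=$file_path ;;                        # leave other lines alone
    esac
    echo "myfile = $myfile"
done < "$input_file_list"
rm -f "$input_file_list"
```

Reading with while read instead of for ... in `cat ...` also keeps paths containing spaces intact.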

Create multiple empty files whose names will be obtained from the file

I want to create empty files where the names are taken from list.txt. I tried this:
declare -a siglist=$(cat list.txt)
for fname in {list}; do
echo $fname > "$fname.zz"
done
Note that your echo solution does not create empty files: each one contains the name followed by a newline.
If you want to stay entirely in the shell (the comments show various ways using touch and xargs), you can use the null-command : (which does nothing) with a redirection as follows.
while IFS= read -r filename; do
: > "${filename}.zz"
done < list.txt
This is portable to all Bourne-heritage shells, so not restricted to bash.
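The difference between the two approaches can be checked by comparing file sizes. The sketch below works in a scratch directory with made-up names, so nothing in the current directory is touched.

```shell
demo=$(mktemp -d)                              # scratch directory for the demonstration
printf '%s\n' alpha beta > "$demo/list.txt"    # made-up names

while IFS= read -r filename; do
    : > "$demo/$filename.zz"                   # null command plus redirection
done < "$demo/list.txt"

echo "alpha" > "$demo/with_echo.zz"            # the echo approach, for comparison

# $(( ... )) strips the padding some wc implementations add
empty_size=$(( $(wc -c < "$demo/alpha.zz") ))
echo_size=$(( $(wc -c < "$demo/with_echo.zz") ))
echo "alpha.zz: $empty_size bytes, with_echo.zz: $echo_size bytes"
rm -rf "$demo"
```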
If you are OK with awk, you could try the following.
awk '{val=(val?val OFS:"")$0} END{system("touch " val)}' Input_file
Explanation: the awk program appends each line to a variable named val, separated by OFS (a space by default). In the END block, system() runs the touch command with val as its argument list, creating all the files in one call. Names containing whitespace or shell metacharacters will not survive this.

Loop through all the files with .txt extension in bash [duplicate]

I am trying to loop over files in a folder and test for .txt extensions.
But I get the following error: "awk: cannot open = (No such file or directory)"
Here's my code:
!/bin/bash
files=$(ls);
for file in $files
do
# extension=$($file | awk -F . '{ print $NF }');
if [ $file | awk -F . "{ print $NF }" = txt ]
then
echo $file;
else
echo "Not a .txt file";
fi;
done;
The way you are doing this is wrong in many ways.
You should never parse the output of ls. It does not handle file names containing special characters intuitively. See Why you shouldn't parse the output of ls(1).
Don't use a plain variable to store a list of file names. The unquoted $files expansion undergoes word splitting, so any name containing whitespace is broken into several words and the loop no longer sees the original file names.
Using awk is absolutely unnecessary here, and the part $file | awk -F . "{ print $NF }" = txt is wrong. The shell splits the line at the |, so nothing passes the file name to awk (that would need echo "$file"), and awk receives =, txt and ] as file names to open, which is where the "cannot open =" error comes from. Inside double quotes the shell also expands $NF before awk ever sees it.
The interpreter line (shebang) should have been #!/bin/bash if you plan to run the script as an executable, i.e. ./script.sh. The more commonly recommended form is #!/usr/bin/env bash, which runs the first bash found on the PATH.
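If a per-name extension test like the one in the question is still wanted, a case pattern does it without awk or any external command. A minimal sketch, where is_txt is a hypothetical helper name and the file names are made up:

```shell
# is_txt is a hypothetical helper: succeeds when the name ends in .txt
is_txt() {
    case $1 in
        *.txt) return 0 ;;
        *)     return 1 ;;
    esac
}

for file in notes.txt archive.tar.gz README; do    # made-up names
    if is_txt "$file"; then
        echo "$file"
    else
        echo "Not a .txt file: $file"
    fi
done
```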
As such your requirement could be simply reduced to
for file in *.txt; do
[ -f "$file" ] || continue
echo "$file"
done
This is a simple example using the glob pattern *.txt, which undergoes pathname expansion to every name in the current directory ending in .txt. The glob is expanded into the list of files before the loop runs; assuming the folder contains 1.txt, 2.txt and foo.txt, the loop becomes
for file in 1.txt 2.txt foo.txt; do
Even when there are no matching files, i.e. when the glob matches nothing, the condition [ -f "$file" ] || continue lets the loop finish gracefully: an unmatched glob is left as the literal string *.txt, and [ -f "$file" ] fails for anything that is not an existing regular file.
Or, if you are targeting Bash specifically, enable the nullglob option so that non-matching globs are removed rather than preserved
shopt -s nullglob
for file in *.txt; do
echo "$file"
done
Another way is to store the glob results in a shell array and iterate over it later to act on each file. This is useful when the list of files has to be passed as an argument list to another command. A properly quoted expansion "${filesList[@]}" preserves spaces, tabs, newlines and other metacharacters in file names.
shopt -s nullglob
filesList=(*.txt)
for file in "${filesList[@]}"; do
echo "$file"
done
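As an illustration of handing the array to another command, the sketch below counts the lines of all matched files in one call. The scratch directory and its two files are made up; one name contains a space on purpose.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)                               # scratch directory (assumption)
printf 'one\ntwo\n' > "$demo/my notes.txt"      # name with a space, on purpose
printf 'three\n'    > "$demo/todo.txt"

shopt -s nullglob
filesList=("$demo"/*.txt)

if (( ${#filesList[@]} > 0 )); then
    # each array element arrives as exactly one argument, spaces included
    total=$(cat "${filesList[@]}" | wc -l)
    echo "${#filesList[@]} files, $total lines in total"
fi
rm -rf "$demo"
```

The length check matters with nullglob: an empty array would otherwise leave cat waiting on standard input.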

Bulk Renaming Files isnt working for me

I am running a shell script on my Mac, and I am getting a "No such file or directory" error.
The input is: the replacement_name, and the working dir.
The output is: changing all files in the directory from $file to $newfilename
#!/bin/sh
echo "-------------------------------------------------"
echo "Arguments:"
echo "Old File String: $1"
echo "New File Name Head: $2"
echo "Directory to Change: $3"
echo "-------------------------------------------------"
oldname="$1"
newname="$2"
abspath="$3"
echo "Updating all files in '$abspath' to $newname.{extension}"
for file in $(ls $abspath);
do
echo $file
echo $file | sed -e "s/$oldname/$newname/g"
newfilename=$("echo $file| sed -e \"s/$oldname/$newname/g\"")
echo "NEW FILE: $newfilename"
mv $abspath/$file $abspath/$newfilename
done
It seems that it doesn't like assigning the result of my one-liner to a variable.
old_filename_string_template.dart
test_template.dart
./bulk_rename.sh: line 16: echo old_filename_string.dart| sed -e "s/old_filename_string/test/g": No such file or directory
NEW FILE:
Test Information:
mkdir /_temp_folder
touch old_filename_string_template.a old_filename_string_template.b old_filename_string_template.c old_filename_string1_template.a old_filename_string1_template.b old_filename_string1_template.c old_filename_string3_template.a old_filename_string3_template.b old_filename_string3_template.c
./convert.sh old_filename_string helloworld /_temp_folder
The double quotes here make the shell look for a command whose name (filename, alias, or function name) is the entire string between the quotes. Obviously, no such command exists.
> newfilename=$("echo $file| sed -e \"s/old_filename_string/$1/g\"")
Removing the double quotes inside the parentheses and the backslashes before the remaining ones will fix this particular error.
The construct $(command sequence) is called a command substitution; the shell effectively replaces this string with the standard output obtained by evaluating command sequence in a subshell.
Most of the rest of your script has far too few quotes, so it's really unclear why you added them here in particular. http://shellcheck.net/ is a useful service which will point out a few dozen more trivial errors. Briefly, anything which contains a file name should be between double quotes.
Try putting the double quotes outside the backticks/substitution (not INSIDE it, as in $("..."))
newfilename="$(....)"
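Putting these points together, the loop from the question might be rewritten roughly as below. This is a sketch, not the original poster's final script: rename_all is a hypothetical wrapper name, the glob replaces the ls parsing, and the demonstration runs in a scratch directory using two of the file names from the question.

```shell
#!/usr/bin/env bash
# rename_all is a hypothetical wrapper around the loop from the question.
rename_all() {
    local oldname="$1" newname="$2" abspath="$3" path file newfilename
    for path in "$abspath"/*; do                   # glob instead of parsing ls
        [ -f "$path" ] || continue                 # skip directories and an unmatched pattern
        file="${path##*/}"
        # the quotes go around the substitution, not inside it
        newfilename="$(printf '%s\n' "$file" | sed -e "s/$oldname/$newname/g")"
        [ "$file" = "$newfilename" ] && continue   # nothing to rename
        echo "NEW FILE: $newfilename"
        mv -- "$path" "$abspath/$newfilename"
    done
}

# Demonstration in a scratch directory, removed when the script exits.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
touch "$demo/old_filename_string_template.a" "$demo/old_filename_string1_template.b"
rename_all old_filename_string helloworld "$demo"
```

Since the old and new strings are pasted into a sed expression, characters such as / or & in either of them would need escaping first.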
By the way, please consider using the Perl rename package, which already does this kind of bulk rename very well, with Perl regex syntax that is easier to use. This Perl rename command may already be available in your (Mac) distro. See intro pages.
