Checking if substring is in filename in bash

I'm trying to create a script that identifies the names of files in a directory and then checks to see if a string is a substring of the name. I'm doing this in bash and cannot use the grep command. Any thoughts?
I have the following code to check if a user submission matches a file name or a string in the name.
read -p name
for file in sample/*; do
echo $(basename "$file")
if [[$(basename "$file") ~= $name]];
then echo "invalid"
fi
done

You can just interpolate the user input into the wildcard.
printf '%s\n' sample/*"$name"*
If you want to loop over the matches, try
for file in sample/*"$name"*; do
# cope with an unmatched glob (nullglob not set)
test -e "$file" || break
: do things with "$file"
done
If you just need to check that the name isn't a substring of an existing file's name:
valid=true
for file in sample/*"$name"*; do
test -e "$file" && valid=false
done
echo "$name is valid? $valid"
The shell by default does not expand a wildcard which doesn't match any files; so in this case, your loop will run once, but the loop variable will not match any existing file. You might also want to look at the nullglob option in Bash to make it loop zero times in this case.
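A minimal sketch of the nullglob approach mentioned above. The helper name name_in_use and the throwaway directory are hypothetical and only serve the illustration. It collects the matches in an array, so an empty array means no file name contains the string.

```bash
#!/usr/bin/env bash
# name_in_use DIR NAME: succeed if some entry in DIR has NAME in its file name.
# (name_in_use is a hypothetical helper, not part of the original answer.)
name_in_use() {
    local dir=$1 name=$2 matches restore
    restore=$(shopt -p nullglob || true)   # remember the current setting
    shopt -s nullglob                      # unmatched globs expand to nothing
    matches=( "$dir"/*"$name"* )           # quoting "$name" keeps it literal
    eval "$restore"                        # put the setting back
    (( ${#matches[@]} > 0 ))
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/report_final.txt"
name_in_use "$demo_dir" "final" && echo "invalid: 'final' is part of an existing name"
name_in_use "$demo_dir" "draft" || echo "valid: 'draft' is not used"
rm -rf "$demo_dir"
```

Note that a leading * does not match hidden files (names starting with a dot) unless the dotglob option is set.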

Related

Bash script MV is disappearing files

I've written a script to go through all the files in the directory the script is located in, identify if a file name contains a certain string and then modify the filename. When I run this script, the files that are supposed to be modified are disappearing. It appears my usage of the mv command is incorrect and the files are likely going to an unknown directory.
#!/bin/bash
string_contains="dummy_axial_y_position"
string_dontwant="dummy_axial_y_position_time"
file_extension=".csv"
for FILE in *
do
if [[ "$FILE" == *"$string_contains"* ]];then
if [[ "$FILE" != *"$string_dontwant"* ]];then
filename= echo $FILE | head -c 15
combined_name="$filename$file_extension"
echo $combined_name
mv $FILE $combined_name
echo $FILE
fi
fi
done
I've done my best to go through the possible errors I've made in the MV command but I haven't had any success so far.
There are a couple of problems and several places where your script can be improved.
filename= echo $FILE | head -c 15
This pipeline runs echo $FILE with the variable filename, set to the empty string, added to its environment. That value is visible only to the echo command; the variable is not set in the current shell. echo does not care about it anyway.
You probably want to capture the output of echo $FILE | head -c 15 into the variable filename but this is not the way to do it.
You need to use command substitution for this purpose:
filename=$(echo $FILE | head -c 15)
head -c 15 outputs only the first 15 bytes of its input (they can span multiple lines, but this does not happen here). head is not the most appropriate tool for this. Use cut -c-15 instead.
But for what you need (extract the first 15 characters of the value stored in the variable $FILE), there is a much simpler way; use a form of parameter expansion called "substring expansion":
filename=${FILE:0:15}
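As a quick illustration of substring expansion, here is a small sketch. The file name is a made-up example, not one from the question.

```bash
#!/usr/bin/env bash
# Substring expansion: ${var:offset:length}; offsets start at 0.
FILE="dummy_axial_y_position_run1.csv"   # hypothetical file name
file_extension=".csv"

filename=${FILE:0:15}                    # first 15 characters
combined_name="$filename$file_extension"

echo "$filename"        # dummy_axial_y_p
echo "$combined_name"   # dummy_axial_y_p.csv
```

No external process is started here, unlike the echo | head pipeline.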
mv $FILE $combined_name
Before running mv, the variables $FILE and $combined_name are expanded (this is called "parameter expansion"). This means that the variables are replaced by their values.
For example, if the value of FILE is abc def and the value of combined_name is mnp opq, the line above becomes:
mv abc def mnp opq
The mv command receives 4 arguments and it attempts to move the files denoted by the first three arguments into the directory denoted by the fourth argument (and it probably fails).
In order to keep the values of the variables as single words (if they contain spaces), always enclose them in double quotes. The correct command is:
mv "$FILE" "$combined_name"
This way, in the example above, the command becomes:
mv "abc def" "mnp opq"
... and mv is invoked with two arguments: abc def and mnp opq.
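The difference can be checked without moving anything. The sketch below uses a hypothetical helper, count_args, that only reports how many arguments it received.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

FILE="abc def"
combined_name="mnp opq"

unquoted=$(count_args $FILE $combined_name)      # word splitting applies
quoted=$(count_args "$FILE" "$combined_name")    # each value stays one word

echo "unquoted: $unquoted, quoted: $quoted"      # unquoted: 4, quoted: 2
```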
combined_name="$filename$file_extension"
There isn't any problem in this line. The quotes are simply not needed.
The variables filename and file_extension are expanded (replaced by their values), but word splitting is not applied in assignments. The value resulting from the replacement is assigned to the variable combined_name as is, even if it contains word separator characters (spaces, tabs, newlines).
The quotes are also not needed here because the values do not contain spaces or other characters that are special in the command line. They must be quoted if they contain such characters.
string_contains="dummy_axial_y_position"
string_dontwant="dummy_axial_y_position_time"
file_extension=".csv"
It is not incorrect to quote the values though.
for FILE in *
do
if [[ "$FILE" == *"$string_contains"* ]];then
if [[ "$FILE" != *"$string_dontwant"* ]]; then
This is also not wrong but it is inefficient.
You can use the expression from the if condition directly in the for statement (and get rid of the if statement):
for FILE in *"$string_contains"*; do
if [[ "$FILE" != *"$string_dontwant"* ]]; then
...
If you have read and understood the above (and some of the linked documentation), you will be able to figure out for yourself where your files were moved :-)
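For reference, the points above could be combined roughly as follows. This is a sketch under assumptions, not the answerer's own script: the wrapper function rename_matching, its directory argument, and the decision to skip a file rather than overwrite an existing target are all additions made for illustration.

```bash
#!/usr/bin/env bash
# rename_matching DIR: hypothetical wrapper around the loop from the question,
# with the fixes described above applied.
rename_matching() {
    local dir=$1
    local string_contains="dummy_axial_y_position"
    local string_dontwant="dummy_axial_y_position_time"
    local file_extension=".csv"
    local path FILE filename combined_name

    for path in "$dir"/*"$string_contains"*; do
        [[ -e $path ]] || continue                      # glob matched nothing
        FILE=${path##*/}                                # strip the directory part
        [[ $FILE != *"$string_dontwant"* ]] || continue
        filename=${FILE:0:15}                           # substring expansion
        combined_name="$filename$file_extension"
        if [[ -e $dir/$combined_name ]]; then           # assumption: never overwrite
            echo "skipping $FILE: $combined_name already exists" >&2
            continue
        fi
        mv -- "$path" "$dir/$combined_name"
    done
}
```

Calling rename_matching . applies it to the current directory, which is what the original loop over * did.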

Loop through all the files with .txt extension in bash [duplicate]

I am trying to loop over files in a folder and test for .txt extensions.
But I get the following error: "awk: cannot open = (No such file or directory)"
Here's my code:
!/bin/bash
files=$(ls);
for file in $files
do
# extension=$($file | awk -F . '{ print $NF }');
if [ $file | awk -F . "{ print $NF }" = txt ]
then
echo $file;
else
echo "Not a .txt file";
fi;
done;
The way you are doing this is wrong in many ways.
You should never parse the output of ls. It does not handle file names containing special characters intuitively. See Why you shouldn't parse the output of ls(1).
Don't use a plain variable to store a list of file names. The output of ls stored in a variable undergoes word splitting when it is expanded unquoted, so any file name containing whitespace is broken into several words.
Using awk is absolutely unnecessary here. The part $file | awk -F . "{ print $NF }" = txt is totally wrong: you are not passing the name of the file to the pipe, just the variable $file; it should have been echo "$file".
The interpreter shebang should have been set as #!/bin/bash in your script if you were planning to run it as an executable, i.e. ./script.sh. A more portable choice is #!/usr/bin/env bash, which runs the first bash found in PATH.
As such your requirement could be simply reduced to
for file in *.txt; do
[ -f "$file" ] || continue
echo "$file"
done
This is a simple example using the glob pattern *.txt, which performs pathname expansion on all the files ending in .txt. Before the loop runs, the glob is expanded into the list of files, i.e. assuming the folder contains 1.txt, 2.txt and foo.txt, the loop becomes
for file in 1.txt 2.txt foo.txt; do
Even when no files match, i.e. when the glob matches nothing (no text files found), the condition [ -f "$file" ] || continue ensures the loop exits gracefully by checking whether the glob returned a valid file or just an unexpanded string. The condition [ -f "$file" ] fails for everything except a regular file.
Or, if you are targeting scripts for the Bourne Again Shell (bash), enable the nullglob option to remove non-matching globs rather than preserving them:
shopt -s nullglob
for file in *.txt; do
echo "$file"
done
Another way is to use a shell array to store the glob results and iterate over them later. This is useful when passing a list of files as an argument list to another command. A properly quoted expansion "${filesList[@]}" preserves spaces, tabs, newlines and other metacharacters in file names.
shopt -s nullglob
filesList=(*.txt)
for file in "${filesList[@]}"; do
echo "$file"
done
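To make the argument-list use case concrete, here is a small sketch. The temporary directory, the file names, and the choice of wc as the receiving command are assumptions for the demo.

```bash
#!/usr/bin/env bash
# Demo directory with one awkward file name (all names are made up).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir/notes.md"

shopt -s nullglob
filesList=( "$demo_dir"/*.txt )   # one array element per matching file
shopt -u nullglob

echo "found ${#filesList[@]} .txt files"   # found 2 .txt files

if (( ${#filesList[@]} > 0 )); then
    # Each element is passed to wc as exactly one argument, spaces included.
    wc -c "${filesList[@]}"
fi
rm -rf "$demo_dir"
```

Checking the array length first avoids calling the command with no file arguments, in which case wc would wait for standard input.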

Bash script: A better way to remove a list of files and directories with wildcards

I am trying to pass a list of files, including wildcard patterns and directories, that I want to delete, but I want to check that they exist before deleting them. If they are deleted, notify that the directory was deleted, not each individual file within the directory. I.e. if I remove /root/*.tst just say, "I removed *.tst".
#!/bin/bash
touch /boot/{1,2,3,4,5,6,7,8,9,10}.tst
touch /root/{1,2,3,4,5,6,7,8,9,10}.tst
mkdir /root/tmpdir
#setup files and directories we want to delete
file=/boot/*.tst /root/*.tst /root/tmpdir
for i in $file; do
if [[ -d "$file" || -f "$file" ]]; then #do I exist
rm -fr $i
echo removed $i #should tell me that I removed /boot/*.tst or /root/*.tst or /root/tmpdir
else
echo $i does not exist # should tell me if /boot/*.tst or /root/*.tst or /root/tmpdir DNE
fi
done
I can't seem to make any combination of single or double quotes or escaping * make the above do what I want it to do.
Before explaining why your code fails, here is what you should use:
for i in /boot/*.tst /root/*.tst /root/tmpdir; do
# i will be a single file or directory name if a pattern matched, or the
# literal pattern if it did not. You can check using this line:
[[ -e $i ]] || continue
# or use shopt -s nullglob before the loop to cause non-matching
# patterns to be silently ignored
rm -fr "$i"
echo "removed $i"
done
done
It appears that you want i to be set to each of the three patterns, which is a little tricky and should probably be avoided, since most of the operators you are using expect single file or directory names, not patterns that match multiple names.
The attempt you show
file=/boot/*.tst /root/*.tst /root/tmpdir
would expand /root/*.tst and try to use the first name in the expansion as a command name, executed in an environment where the variable file had the literal value /boot/*.tst. To include all the patterns in the string, you would need to escape the spaces between them, with either
file=/boot/*.tst\ /root/*.tst\ /root/tmpdir
or more naturally
file="/boot/*.tst /root/*.tst /root/tmpdir"
Either way, the patterns are not yet expanded; the literal * is stored in the value of file. You would then expand it using
for i in $file # no quotes!
and after $file expands to its literal value, the stored patterns would be expanded into the set of matching file names. However, this loop would only work for file names that didn't contain whitespace; a single file named foo bar would be seen as two separate values to assign to i, namely foo and bar. The correct way to deal with such file names in bash is to use an array:
files=( /boot/*.tst /root/*.tst /root/tmpdir )
# Quotes are necessary this time to protect space-containing filenames
# Unlike regular parameter assignment, the patterns were expanded to the matching
# set of file names first, then the resulting list of files was assigned to the array,
# one file name per element.
for i in "${files[@]}"
You can replace
file=/boot/*.tst /root/*.tst /root/tmpdir
by
printf -v file "%s " /boot/*.tst /root/*.tst /root/tmpdir
The shell expands globs automatically. If you want to be able to print the literal globs in an error message then you'll need to quote them.
rmglob() {
local glob
for glob in "$@"; do
local matched=false
for path in $glob; do
[[ -e $path ]] && rm -rf "$path" && matched=true
done
$matched && echo "removed $glob" || echo "$glob does not exist" >&2
done
}
rmglob '/boot/*.tst' '/root/*.tst' '/root/tmpdir'
Notice the careful use of quoting. The arguments to rmglob are quoted. The $glob variable inside the function is not quoted (for path in $glob), which triggers shell expansion at that point.
Many thanks to everyone's posts, including John Kugelman's.
This is the code I finally went with that provided two types of deleting. The first is a bit more forceful deleting everything. The second preserved directory structures just removing the files. As per above, note that whitespace in file names is not handled by this method.
rmfunc() {
local glob
for glob in "$@"; do
local matched=false
local checked=true
for path in $glob; do
$checked && echo -e "\nAttempting to clean $glob" && checked=false
[[ -e $path ]] && rm -fr "$path" && matched=true
done
$matched && echo -e "\n\e[1;33m[\e[0m\e[1;32mPASS\e[1;33m]\e[0m Cleaned $glob" || echo -e "\n\e[1;33m[\e[0m\e[1;31mERROR\e[1;33m]\e[0m Can't find $glob (non fatal)."
done
}
# Type 2 removal
xargfunc() {
local glob
for glob in "$@"; do
local matched=false
local checked=true
for path in $glob; do
$checked && echo -e "\nAttempting to clean $glob" && checked=false
[[ -n $(find $path -type f) ]] && find $path -type f | xargs rm -f && matched=true
done
$matched && echo -e "\n\e[1;33m[\e[0m\e[1;32mPASS\e[1;33m]\e[0m Cleaned $glob" || echo -e "\n\e[1;33m[\e[0m\e[1;31mERROR\e[1;33m]\e[0m Can't find $glob (non fatal)."
done
}

Using Variables with grep, and an IF statement regarding this

I am looking to search for strings within a file using variables.
I have a script that will accept 3 or 4 parameters: 3 are required; the 4th isn't mandatory.
I would like to search the text file for the 3 parameters matching within the same line, and if they do match then I want to remove that line and replace it with my new one - basically it would update the 4th parameter if set, and avoid duplicate entries.
Currently this is what I have:
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
if [ "$input" == "" ]; then
echo $domain $type $item $value >>~/etc/security/limits.conf
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
else
cat ~/etc/security/limits.conf | egrep -v "$domain|$type|$item" >~/etc/security/limits.conf1
rm -rf ~/etc/security/limits.conf
mv ~/etc/security/limits.conf1 ~/etc/security/limits.conf
echo $domain $type $item $value >>~/etc/security/limits.conf
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
exit 0
fi
Now I already know that the input=egrep etc.. will not work; it works if I hard code some values, but it won't accept those variables. Basically I have domain=$1, type=$2 and so on.
I would like it so that if all 3 variables are not matched within one line, than it will just append the parameters to the end of the file, but if the parameters do match, then I want them to be deleted, and appended to the file. I know I can use other things like sed and awk, but I have yet to learn them.
This is for a school assignment, and all help is very much appreciated, but I'd also like to learn why and how it works/doesn't, so if you can provide answers to that as well that would be great!
Three things:
To assign the output of a command, use var=$(cmd).
Don't put spaces around the = in assignments.
Variables are not expanded inside single quotes: use double quotes.
To summarize:
input=$(egrep -e "$domain\s+$type\s+$item" ~/etc/security/limits.conf)
Also note that ~ is your home directory, so if you meant /etc/security/limits.conf and not /home/youruser/etc/security/limits.conf, leave off the ~.
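The effect of the quoting style can be seen without running egrep at all. In this sketch the three values are hypothetical placeholders for the script's parameters.

```bash
#!/usr/bin/env bash
domain="alice"; type="hard"; item="nofile"   # hypothetical values

single='$domain\s+$type\s+$item'   # single quotes: nothing is expanded
double="$domain\s+$type\s+$item"   # double quotes: variables are expanded

printf '%s\n' "$single"   # $domain\s+$type\s+$item
printf '%s\n' "$double"   # alice\s+hard\s+nofile
```

The backslash before s survives in both cases, because inside double quotes a backslash is only special before $, `, ", \ and a newline.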
You have several bugs in your script. Here's your script with some comments added
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
# use " not ' in the string above or the shell can't expand your variables.
# some versions of egrep won't understand '\s'. The safer, POSIX character class is [[:blank:]].
if [ "$input" == "" ]; then
# the shell equality test operator is =, not ==. Some shells will also take == but don't count on it.
# the normal way to check for a variable being empty in shell is with `-z`
# you can have problems with tests in some shells if $input is empty, in which case you'd use [ "X$input" = "X" ].
echo $domain $type $item $value >>~/etc/security/limits.conf
# echo is unsafe and non-portable, you should use printf instead.
# the above calls echo with 4 args, one for each variable - you probably don't want that and should have double-quoted the whole thing.
# always double-quote your shell variables to avoid word splitting and file name expansion (google those - you don't want them happening here!)
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
# the correct form would be:
# printf '"%s" "%s" "%s" "%s" has been successfully added to your limits.conf file.\n' "$domain" "$type" "$item" "$value"
else
cat ~/etc/security/limits.conf | egrep -v "$domain|$type|$item" >~/etc/security/limits.conf1
# Useless Use Of Cat (UUOC - google it). [e]grep can open files just as easily as cat can.
rm -rf ~/etc/security/limits.conf
# -r is for recursively removing files in a directory - inappropriate and misleading when used on a single file.
mv ~/etc/security/limits.conf1 ~/etc/security/limits.conf
# pointless to remove the file above when you're overwriting it here anyway
# If your egrep above failed to create your temp file (e.g. due to memory or permissions issues) then the "mv" above would zap your real file. the correct way to do this is:
# egrep regexp file > tmp && mv tmp file
# i.e. use && to only do the mv if creating the tmp file succeeded.
echo $domain $type $item $value >>~/etc/security/limits.conf
# see previous echo comments.
echo \"$domain\" \"$type\" \"$item\" \"$value\" has been successfully added to your limits.conf file.
# ditto
exit 0
# pointless and misleading having an explicit "exit <success>" when that's what the script will do by default anyway.
fi
This line:
input=$(egrep -e '$domain\s+$type\s+$item' ~/etc/security/limits.conf)
requires double quotes around the regex to allow the shell to interpolate the variable values.
input=$(egrep -e "$domain\s+$type\s+$item" ~/etc/security/limits.conf)
You need to be careful with backslashes; you probably don't have to double them up in this context, but you should be sure you know why.
You should be aware that your first egrep command is much more restrictive in what it selects than the second egrep, which is used to delete data from the file. The first requires an entry with all three fields on a single line; the second only requires a match with any one of the words (and that could be part of a larger word) to delete the line.
Since ~/etc/security/limits.conf is a file, there is no need to use the -r option of rm; it is advisable not to use the -r unless you intend to remove directories.
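Pulling the advice from these answers together, a possible update-or-append routine is sketched below. The function name set_limit and its argument order are hypothetical, grep -E is used as the standard spelling of egrep, and the sketch assumes the first three values contain no regular-expression metacharacters (a domain of * would need escaping).

```bash
#!/usr/bin/env bash
# set_limit FILE DOMAIN TYPE ITEM VALUE  (hypothetical helper)
# Assumes DOMAIN, TYPE and ITEM contain no regex metacharacters.
set_limit() {
    local file=$1 domain=$2 type=$3 item=$4 value=$5
    # The same three-field pattern is used for matching and for deleting.
    local pattern="^[[:blank:]]*${domain}[[:blank:]]+${type}[[:blank:]]+${item}([[:blank:]]|\$)"
    local tmp
    tmp=$(mktemp) || return 1

    # Keep every line that does NOT match. grep exits with 1 when it selects
    # no lines, which is fine here; a status above 1 is a real error.
    grep -Ev -- "$pattern" "$file" > "$tmp" || (( $? == 1 )) || { rm -f "$tmp"; return 1; }

    # Append the new entry, and replace the original only if that succeeded.
    printf '%s %s %s %s\n' "$domain" "$type" "$item" "$value" >> "$tmp" &&
        mv -- "$tmp" "$file"
}
```

For example, set_limit ~/etc/security/limits.conf "$domain" "$type" "$item" "$value" would replace an existing entry or append a new one. Because the replacement file is created by mktemp, its permissions may differ from the original's.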

Basename puts single quotes around variable

I am writing a simple shell script to make automated backups, and I am trying to use basename to create a list of directories and then parse this list to get the first and the last directory from the list.
The problem is: when I use basename in the terminal, all goes fine and it gives me the list exactly as I want it. For example:
basename -a /var/*/
gives me a list of all the directories inside /var without the / in the end of the name, one per line.
BUT, when I use it inside a script and pass a variable to basename, it puts single quotes around the variable:
while read line; do
dir_name=$(echo $line)
basename -a $dir_name/*/ > dir_list.tmp
done < file_with_list.txt
When running with -x:
+ basename -a '/Volumes/OUTROS/backup/test/*/'
and, therefore, the result is not what I need.
Now, I know there must be a thousand ways to go around the basename problem, but then I'd learn nothing, right? ;)
How to get rid of the single quotes?
And if my directory name has spaces in it?
If your directory name could include spaces, you need to quote the value of dir_name (which is a good idea for any variable expansion, whether you expect spaces or not).
while read line; do
dir_name=$line
basename -a "$dir_name"/*/ > dir_list.tmp
done < file_with_list.txt
(As jordanm points out, you don't need to quote the RHS of a variable assignment.)
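The single quotes in the trace are only how bash's xtrace output displays a word that still contains a literal *, i.e. a glob that matched nothing; they are not passed to basename. The sketch below shows that a quoted variable followed by an unquoted */ still expands, even with spaces in the path. The directory names are made up, and parameter expansion is used instead of basename -a so the example does not depend on that option.

```bash
#!/usr/bin/env bash
# Build a throwaway tree whose parent directory contains a space.
parent=$(mktemp -d)
dir_name="$parent/my backups"
mkdir -p "$dir_name/first" "$dir_name/second"
touch "$dir_name/file.txt"

# The quoted part stays one word; the unquoted */ is still expanded
# and matches directories only.
dirs=( "$dir_name"/*/ )
names=( "${dirs[@]%/}" )      # drop the trailing slash from each element
names=( "${names[@]##*/}" )   # drop the leading path, like basename

printf '%s\n' "${names[@]}"   # prints first, then second
rm -rf "$parent"
```

The first and last directory are then available as "${names[0]}" and "${names[${#names[@]}-1]}" without a temporary file.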
Assuming your goal is to populate dir_list.tmp with a list of directories found under each directory listed in file_with_list.txt, this might do.
#!/bin/bash
inputfile=file_with_list.txt
outputfile=dir_list.tmp
rm -f "$outputfile" # the -f makes rm fail silently if file does not exist
while read line; do
# basic syntax checking
if [[ ! ${line} =~ ^/[a-z][a-z0-9/-]*$ ]]; then
continue
fi
# collect targets using globbing
for target in "$line"/*; do
if [[ -d "$target" ]]; then
printf "%s\n" "$target" >> "$outputfile"
fi
done
done < "$inputfile"
As you develop whatever tool will process your dir_list.tmp file, be careful of special characters (including spaces) in that file.
Note that I'm using printf instead of echo so that targets whose first character is a hyphen won't cause errors.
This might work
while read; do
find "$REPLY" >> dir_list.tmp
done < file_with_list.txt
