Find file names in other Bash files using grep

How do I loop through a list of Bash file names from an input text file, grep each file in a directory for each file name (to see if the file name is contained in the file), and output to a text file all the file names that weren't found in any file?
#!/bin/sh
# This script will be used to output any unreferenced bash files
# included in the WebAMS Project
# Read file path of bash files and file name input
SEARCH_DIR=$(awk -F "=" '/Bash Dir/ {print $2}' bash_input.txt)
FILE_NAME=$(awk -F "=" '/Input File/ {print $2}' bash_input.txt)
echo $SEARCH_DIR
echo $FILE_NAME
exec<$FILE_NAME
while read line
do
echo "IN WHILE"
if (-z "$(grep -lr $line $SEARCH_DIR)"); then
echo "ENTERED"
echo $filename
fi
done

Save this as search.sh, updating SEARCH_DIR as appropriate for your environment:
#!/bin/bash
SEARCH_DIR=some/dir/here
while read filename
do
if [ -z "$(grep -lr "$filename" "$SEARCH_DIR")" ]
then
echo "$filename"
fi
done
Then:
chmod +x search.sh
./search.sh < your-list-of-names.txt > files-i-could-not-find.txt

This can also be done with the grep and find commands:
while read -r line; do (find . -type f -exec grep -l "$line" {} \;); done < file
OR
while read -r line; do grep -rl "$line" .; done < file
-r --> recursive
-l --> files-with-matches (print only the names of the files that contain the search string)
This reads each file name from the input file and searches for files whose contents contain that name. Whenever a match is found, the name of the matching file is printed.

You're using regular parentheses instead of square brackets in your if statement.
The square brackets are a test command. You're running a test (in your case, whether a string has zero length or not). If the test is successful, the [ ... ] command returns an exit code of zero. The if statement sees that exit code and runs the then clause of the if statement. Otherwise, if an else clause exists, that is run instead.
Because [ and ] are actually commands, you must leave whitespace around each of them.
Right
if [ -z "$string" ]
Wrong
if [-z "$string"] # Need white space around the brackets
Sort of wrong
if [ -z $string ] # Won't work if "$string" is empty or contains spaces
By the way, the following are the same:
if test -z "$string"
if [ -z "$string" ]
Be careful with that grep command. If there are spaces or newlines in the string returned, it may not do what you think it does.
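For example, a minimal sketch that sidesteps the output-capturing issue entirely by testing grep's exit status with -q (reusing the SEARCH_DIR and FILE_NAME variables from the question):
while read -r filename
do
    if ! grep -qr "$filename" "$SEARCH_DIR"
    then
        echo "$filename"
    fi
done < "$FILE_NAME"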

Related

Counting number of lines in file and saving it in a bash file

I am trying to loop through all the files in a folder and add the names of the files that have 10 lines to a txt file, but I don't know how to write the if statement.
As of right now, what I have is:
for FILE in *.txt do if wc $FILE == 10; then "$FILE" >> saved_names.txt fi done
I am getting stuck on how to write the test that should evaluate to a boolean for the if statement.
I have already tried the if statement as:
if [ wc $FILE != 10 ]
if "wc $FILE" != 10
if "wc $FILE != 10"
as well as other ways but I don't seem to get it right. I know I am new to Bash but I can't seem to find a solution to this question.
There are a few problems in your code.
To count the number of lines in the file you should run the "wc -l" command. However, that command prints both the number of lines and the name of the file (for example: 10 a.txt - you can test it by running the command on a file in your terminal). To get only the number of lines, redirect the file to the command's standard input (wc -l < file).
"==" is used in Bash to compare strings. To compare integers, as in this case, you should use "-eq" (take a look here https://tldp.org/LDP/abs/html/comparison-ops.html).
In terms of brackets: to use the result of the wc command in your script you need command substitution, which runs the command and replaces it with its output - the correct syntax is $(wc -l < file). To get the result of the comparison as a boolean for the if statement, you need square brackets with spaces around the expression, e.g. [ 1 -eq 1 ].
To save the name of the file in another file using >>, you first need to write the name to standard output (>> redirects standard output to the chosen place). You can do that with the echo command.
The code should look like this:
#!/bin/bash
for FILE in *.txt
do
if [ "$(wc -l < "$FILE")" -eq 10 ]
then
echo "$FILE" >> saved_names.txt
fi
done
Try:
for file in *.txt; do
if [[ $(wc -l < "$file") -eq 10 ]]; then
printf '%s\n' "$file"
fi
done > saved_names.txt
Change > to >> if you want to append the filenames.
Related docs:
Command Substitution
Conditional Constructs
Extract the actual number of lines from a file with wc -l $FILE | cut -f1 -d' ' and use the -eq operator:
for FILE in *.txt; do if [ "$(wc -l "$FILE" | cut -f1 -d' ')" -eq 10 ]; then echo "$FILE" >> saved_names.txt; fi; done

Bash (split) file name comparison fails

In my directory I have files (*fastq.gz.fasta) and directories, whose names contain the filenames (*fastq.gz.fasta-blastdb):
IVC6_Meino.clust.gz.fasta-blastdb
IVC5_Mehiv.clust.gz.fasta-blastdb
....
IVC6_Meino.clust.gz.fasta
IVC5_Mehiv.clust.gz.fasta
....
In a Bash script I want to compare the filenames with the directories, using cut on the latter to extract only the filename part. If those two names match I want to do further stuff (for now, echo match or no match respectively).
I have written the following piece of code:
#!/bin/bash
for file in *.fasta
do
for db in *-blastdb
do
echo $file, $db | cut -d '-' -f 1
if [[ $file = "$db | cut -d '-' -f 1" ]]; then
echo "match"
else
echo "no match"
fi
done
done
But it does not detect matches. The output looks like this:
...
IVC6_Meino.clust.gz.fasta, IIIA11_Meova.clust.gz.fasta
no match
IVC6_Meino.clust.gz.fasta, IVC5_Mehiv.clust.gz.fasta
no match
IVC6_Meino.clust.gz.fasta, IVC6_Meino.clust.gz.fasta
no match
The last line should read match since, as you can see, the strings look the same.
What am I missing?
You can use parameter expansion to do this more easily:
for file in *.fasta
do
for db in *-blastdb
do
echo "$file", "$db"
if [[ "${file%%.fasta}" = "${db%%.fasta-blastdb}" ]]; then
echo "match"
else
echo "no match"
fi
done
done
If you want to fix yours, the problem is the use of $db | cut -d '-' -f 1. On the echo line it looks as though echo is printing the trimmed name; it isn't. cut is printing it, because the whole echo output is piped through cut. And when you do [[ $file = "$db | cut -d '-' -f 1" ]], the pipeline inside the quotes is never run: $file is compared against the literal string made up of the contents of $db followed by | cut -d '-' -f 1.
You need to use the $(..) shell construct to capture the output of the pipe and you need to echo to get the contents of $db to start the pipe. You should quote "$db" so you do not have word splitting or globbing from the contents of the variable.
Like so:
for file in *.fasta
do
for db in *-blastdb
do
ts=$(echo "$db" | cut -d '-' -f 1)
echo "$file", "$ts"
if [[ "$file" = "$ts" ]]; then
echo "match"
else
echo "no match"
fi
done
done # this works I think -- not tested...
Please be careful with your quoting in Bash and liberally use ShellCheck.
The structure you have is also not the most efficient. You will loop over the *-blastdb glob once for every file matched by *.fasta. If you have a lot of files, that could get really slow.
To solve that, you could rewrite this loop with Bash arrays (best if you have Bash 4+; a sketch of that follows the awk version below) or use awk:
ext1=.fasta
ext2=.fasta-blastdb
awk 'FNR==NR{
s=$0
sub("\\"ext1"$","",s)
seen[s]=$0
next}
{
s=$0
sub("\\"ext2"$","",s)
if (s in seen)
print seen[s], $0
}
' ext1="$ext1" ext2="$ext2" <(for fn in *$ext1; do echo "$fn"; done) <(for fn in *$ext2; do echo "$fn"; done)
Each glob is only expanded once and awk uses an array to test whether the basenames are the same.
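For completeness, a rough sketch of the Bash-4 associative array approach mentioned above (untested; it indexes the *.fasta names once and then looks each database name up):
# Index every *.fasta file by its basename (declare -A needs Bash 4+).
declare -A seen
for f in *.fasta; do
    seen["${f%.fasta}"]=$f
done
# Look each database directory up by its basename.
for db in *-blastdb; do
    base=${db%.fasta-blastdb}
    if [[ -n "${seen[$base]+set}" ]]; then
        echo "${seen[$base]}, $db: match"
    else
        echo "$db: no match"
    fi
done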

How to read multiple lines in while statement in ksh

I am creating a script to help me through my daily work and automate it. I have run into a problem when trying to feed multiple input lines into my while loop. I usually do this with a for loop, which I run directly from the command line.
Sample:
for i in `cat listoffiles.txt`
do
echo $i
find <path> -name *$i* | awk -F "." {'print $4'} #to display a specific value
done
Now I am trying to automate it with a while loop, but I am having problems reading multiple input lines in it.
For example, I want to search for these inputs:
For
Example
only
here is my script for it:
#!/bin/ksh
echo Please enter file #:
read Var1
while true
do
VarSession=`find $OT_DIR/archive*/ -name *$Var1* | awk -F "." {'print $4'}`
if [ "$VarSession" = "" ]
then
echo No match for File# $Var1 on this leg or is out of retention.
else
echo File# $Var1 is under Session# $VarSession
fi
done
VarSession=`find $OT_DIR/archive*/ -name *$Var1* | awk -F "." {'print $4'}`
Assuming that you provide 1 2 3 as input, the line above translates to this:
VarSession=`find $OT_DIR/archive*/ -name "1 2 3" | awk -F "." {'print $4'}`
But you want to search for each of those values separately, so you need another loop. A for loop serves the purpose of traversing white-space separated entries.
Also, based on the original script that you showed, I assume you want the script to search file by file rather than scanning entire directories. However, the statement above puts all of find's output into the variable without traversing it. To traverse it line by line, a while loop does the job.
#!/bin/ksh
# -n switch suppresses printing a newline
echo -n 'Please enter file #: '
read Var1
# Traverse over all entered values in Var1 (separated by white space)
for i in $Var1
do
#Set a flag to zero, logic explained later
Flag=0
find $OT_DIR/archive*/ -name "*$i*" | while read FileName
do
#Set the Flag to 1 if find command finds something
Flag=1
VarSession=`echo $FileName | awk -F "." {'print $4'}`
if [ "$VarSession" = "" ]
then
#If find found a file but VarSession has nothing then file name is not correct
echo "Some conventions went wrong in file name: $FileName"
else
echo "File# $i is under Session# $VarSession"
fi
done
#If find found nothing, there was no match
if [ $Flag -eq 0 ]
then
echo "No match for File# $i on this leg or is out of retention."
fi
done

Bourne Shell doesn't find unix commands on script

#!/bin/sh
echo "Insert the directory you want to detail"
read DIR
#Get the files:
FILES=`ls "$DIR" | sort`
echo "Files in the list:"
echo "$FILES"
echo ""
echo "Separating directories from files..."
for FILE in $FILES
do
PATH=${DIR}"/$FILE"
OUTPUT="Path: $PATH"
if [ -f "$PATH" ]; then
NAME=`echo "$FILE" | cut -d'.' -f1`
OUTPUT=${OUTPUT}" (filename: $NAME"
EXTENSION=`echo "$FILE" | cut -s -d'.' -f2`
if [ ${#EXTENSION} -gt 0 ]; then
OUTPUT=${OUTPUT}" - type: $EXTENSION)"
else
OUTPUT=${OUTPUT}")"
fi
elif [ -d "$PATH" ]; then
OUTPUT=${OUTPUT}" (dir name: $FILE)"
fi
echo "$OUTPUT"
done
I get this output when running it (I ran using relative path and full path)
$ ./problem.sh
Insert the directory you want to detail
.
Files in the list:
directoryExample
problem.sh
Separating directories from files...
Path: ./directoryExample (dir name: directoryExample)
./problem.sh: cut: not found
./problem.sh: cut: not found
Path: ./problem.sh (filename: )
$
$
$ ./problem.sh
Insert the directory you want to detail
/home/geppetto/problem
Files in the list:
directoryExample
problem.sh
Separating directories from files...
Path: /home/geppetto/problem/directoryExample (dir name: directoryExample)
./problem.sh: cut: not found
./problem.sh: cut: not found
Path: /home/geppetto/problem/problem.sh (filename: )
$
As you can see, I received "cut: not found" twice while building the output string for the file types. Why? (I am using FreeBSD)
PATH is the variable the shell uses to store the list of directories where commands like cut are looked up. You overwrote the value of that variable, losing the initial list. The easy fix is to not use PATH as a variable name in your for loop. The more complete answer is to avoid variable names consisting only of uppercase letters, as those are reserved for use by the shell. Include at least one lowercase letter or number in all your own variable names to avoid interfering with current (or future) variables used by the shell.
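As a sketch, the loop from the question only needs the variable renamed (filepath below is just an illustrative name); the extension-handling branch is unchanged and omitted here for brevity:
for FILE in $FILES
do
    filepath="${DIR}/${FILE}"
    OUTPUT="Path: $filepath"
    if [ -f "$filepath" ]; then
        NAME=`echo "$FILE" | cut -d'.' -f1`
        OUTPUT="${OUTPUT} (filename: $NAME)"
    elif [ -d "$filepath" ]; then
        OUTPUT="${OUTPUT} (dir name: $FILE)"
    fi
    echo "$OUTPUT"
done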

How do I use Bash to create a copy of a file with an extra suffix before the extension?

This title is a little confusing, so let me break it down. Basically I have a full directory of files with various names and extensions:
MainDirectory/
image_1.png
foobar.jpeg
myFile.txt
For an iPad app, I need to create copies of these with the suffix #2X appended to the end of all of these file names, before the extension - so I would end up with this:
MainDirectory/
image_1.png
image_1#2X.png
foobar.jpeg
foobar#2X.jpeg
myFile.txt
myFile#2X.txt
Instead of changing the file names one at a time by hand, I want to create a script to take care of it for me. I currently have the following, but it does not work as expected:
#!/bin/bash
FILE_DIR=.
#if there is an argument, use that as the files directory. Otherwise, use .
if [ $# -eq 1 ]
then
$FILE_DIR=$1
fi
for f in $FILE_DIR/*
do
echo "Processing $f"
filename=$(basename "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
newFileName=$(echo -n $filename; echo -n -#2X; echo -n $extension)
echo Creating $newFileName
cp $f newFileName
done
exit 0
I also want to keep this to pure Bash, and not rely on OS-specific calls. What am I doing wrong? What can I change, or what code would work, to do what I need?
#!/bin/sh -e
cd "${1-.}"
for f in *; do
cp "$f" "${f%.*}#2X.${f##*.}"
done
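For reference, a quick illustration of the two parameter expansions used above (the file name is just an example):
f=image_1.png
echo "${f%.*}"               # image_1 - the name with the last extension removed
echo "${f##*.}"              # png - everything after the last dot
echo "${f%.*}#2X.${f##*.}"   # image_1#2X.png - the name of the new copy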
It's very easy to do that with awk in one line like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "#2X." $2 }' | sh
With ls -1 you get just the bare list of files; then you pipe it into awk, using the dot (.) as the field separator, and build a shell command that creates a copy of each file.
I suggest running the command without the final sh pipe first, to check that the generated cp commands are correct. Like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "#2X." $2 }'
