What does each line of this bash script do?

I found an old past paper question with content not covered in my course. I hope I don't get examined on that, but what does this bash script do? I know grep takes user input and outputs the lines containing the input, echo just repeats its input, and cat just displays its input. But I have no idea what this does as a whole. Any help please?
#!/bin/bash
outputFile=$1
for file in $(find -name '*txt' | grep data)
do
echo $file >> $outputFile
cat $file >> $outputFile
done

Each line:
#!/bin/bash
Hash-bang the script to use bash
outputFile=$1
Set the variable named "outputFile" to the first parameter passed into the script. Running the script would look like bash myScript.sh "/some/file/to/output.txt"
for file in $(find -name '*txt' | grep data)
do
Loop through every file in this directory and its subdirectories, looking for files whose names end with the characters "txt" and whose paths contain the characters "data". On each iteration of the for loop, the name of the file found is assigned to the variable "file".
echo $file >> $outputFile
Echo out/print the file name stored in the variable "file" to the outputFile
cat $file >> $outputFile
Take the contents of the file and stick it in the outputFile.
done
End the For Loop
There are some issues with this script, though. If $outputFile or $file has a space in its name or path, the script will fail. It's good practice to put double quotes around variable expansions, like:
cat "$file" >> "$outputFile"

#!/bin/bash
The shebang. If this script is executable and invoked directly, as in ./this_script or by being found in the PATH, it will be run with /bin/bash.
outputFile=$1
Assign the first argument to the name outputFile.
++ find -name '*txt'
Recursively list all files with a name ending in "txt". It would be more standard to include the path and write this as find . -name '*.txt'.
+ … | grep data
Filter the previous list of paths, keeping only those that contain the string "data" anywhere (file name or directory name). If only the file name is meant to match, this pipe could be eliminated by writing find . -name '*data*txt'.
for file in $(find -name '*txt' | grep data)
For every word in the output of the find | grep pipeline, assign that word to the name file and run the loop. This can break down if any of the found names have whitespace or glob characters in them. It would be better to use find's native -exec flag to handle this.
echo $file >> $outputFile
Append the expansion of the variable "file" to a new or existing file at the path found by expanding $outputFile. If the former expansion starts with a dash, echo could treat it as an option. If the latter expansion has whitespace or a glob character in it, this may cause an "ambiguous redirect" error. It would be better to quote the expansions, and to use printf to avoid echo's option edge case, as in printf '%s\n' "$file" >> "$outputFile".
cat $file >> $outputFile
Append the contents of the file found at the expansion of the variable "file" to the path found by expanding $outputFile, or cause another ambiguous redirect error. It would be better to quote the expansions, like cat "$file" >> "$outputFile".
Assuming that none of the aforementioned expansion edge-cases were expected, it would be better to write this entire script like this:
find . -name '*data*txt' -print -exec cat {} \; >> "$1"
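A quick way to convince yourself of what that one-liner does is to run it in a scratch directory. The sketch below is only an illustration: the directory layout, the file names, and the helper name concat_data are made up, and it assumes mktemp -d is available.

```shell
#!/bin/bash
# Hypothetical helper wrapping the one-liner above: append the name and
# contents of every matching file under the current directory to "$1".
concat_data() {
    find . -name '*data*txt' -print -exec cat {} \; >> "$1"
}

# Scratch area (assumed layout, for illustration only).
workdir=$(mktemp -d)
mkdir -p "$workdir/tree/sub dir"
printf 'alpha\n' > "$workdir/tree/data1.txt"
printf 'beta\n'  > "$workdir/tree/sub dir/my data.txt"
printf 'gamma\n' > "$workdir/tree/notes.txt"   # no "data" in the name: skipped

# The output file lives outside the searched tree, so it is never matched itself.
out="$workdir/out.log"
( cd "$workdir/tree" && concat_data "$out" )
cat "$out"
# Remove "$workdir" with rm -rf when finished.
```

Each matching path is followed by that file's contents, and the name containing a space is handled correctly. The order in which find visits the files is not guaranteed.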

Related

Assign files in a list when using the bash find command

I'm trying to see if I can assign the output of the find command to a variable. The result would be a list that I iterate over one file at a time to evaluate each file.
I've tried this:
#!/bin/bash
PATH=/Users/mike/test
LIST='find $PATH -name *.txt'
for newfiles in @LIST; do
#checksize
echo $newfiles
done
My output is:
@LIST
I'm trying to do the same thing in bash as the glob function does in Perl.
var = glob "PATH/*.txt";

Use $(command) to execute command and substitute its output in place of that construct.
list=$(find "$PATH" -name '*.txt')
And to access a variable, put $ before the name, not @ (your perl experience is showing).
for newfiles in $list; do
echo "$newfiles"
done
However, it's dangerous to parse the output of find like this, because you'll get incorrect results if any of the filenames contain whitespace -- it will be treated as multiple names. It's better to pipe the output:
find "$PATH" -name '*.txt' | while read -r newfiles; do
echo "$newfiles"
done
Also, notice that you should quote any variables that you don't want to be split into multiple words if they contain whitespace.
And avoid using all-uppercase variable names. This is conventionally reserved for environment variables.
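To see the whitespace problem concretely, here is a small sketch that counts how many items each kind of loop sees. The scratch directory and file names are made up for the demonstration, and it assumes mktemp -d returns a path that itself contains no whitespace.

```shell
#!/bin/sh
# Scratch directory with one awkward name (hypothetical example data).
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/two words.txt"

# Unquoted command substitution: the shell splits on whitespace,
# so "two words.txt" arrives as two separate items.
split_count=0
for f in $(find "$dir" -name '*.txt'); do
    split_count=$((split_count + 1))
done

# Line-by-line read: each line of find's output stays one item.
# Reading from a file (rather than a pipe) keeps the counter in this shell.
find "$dir" -name '*.txt' > "$dir/found.list"
line_count=0
while IFS= read -r f; do
    line_count=$((line_count + 1))
done < "$dir/found.list"

echo "for loop saw $split_count items; while read saw $line_count"
# prints: for loop saw 3 items; while read saw 2
rm -rf "$dir"
```

With a pipe (find ... | while read ...), the loop body runs in a subshell in bash, so a counter like line_count would be lost after the loop. That is why this sketch redirects from a file instead.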

LIST=$(find $PATH -name *.txt)
for newfiles in $LIST; do
Beware that you will have issues if any of the files have whitespace in the names.

Assuming you are using bash 4 or later, don't use find at all here.
shopt -s globstar nullglob
list=( "$path"/**/*.txt )
for newfile in "${list[@]}"; do
echo "$newfile"
done

Piping output of bash function

I'm trying to connect the inputs/outputs of two bash functions with a
pipe. Here is a complete program which illustrates my issue:
function print_info {
files=$(ls);
echo $files;
}
touch "file.pattern"
print_info | grep "pattern"
rm -f file.pattern
But this simply outputs a list of all files, not those that match
"pattern". Can anyone help me understand why?

The reason this isn't working is that in
echo $files;
the variable $files is subject to shell expansion (i.e., it is expanded into individual arguments to echo), and the resulting tokens are printed by echo delimited by spaces. This means that the output of it is a single line, and grep handles it accordingly.
The least invasive fix is to use
echo "$files";
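The difference is easy to reproduce without touching the file system. This is a minimal sketch in which a hard-coded three-line string stands in for the output of ls; the file names are invented.

```shell
#!/bin/sh
# Simulated multi-line listing (stand-in for the output of ls).
files=$(printf '%s\n' 'a.txt' 'file.pattern' 'z.log')

# Unquoted: word splitting collapses the newlines, so grep sees ONE line
# and prints all of it, because that line contains "pattern".
unquoted=$(echo $files | grep "pattern")

# Quoted: the newlines survive, so grep prints only the matching line.
quoted=$(echo "$files" | grep "pattern")

printf 'unquoted: %s\n' "$unquoted"   # unquoted: a.txt file.pattern z.log
printf 'quoted:   %s\n' "$quoted"     # quoted:   file.pattern
```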

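Don't parse the output of the ls command. Note that grep reads file contents when it is given file names as arguments. To search the contents of the regular files in the current directory, you could use find:
find . -maxdepth 1 -type f -exec grep "pattern" {} \;
If you are getting the file names from a function, pass them to grep as arguments:
grep "pattern" $(print_info)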

Finding files in list using bash array loop

I'm trying to write a script that reads a file with filenames, and outputs whether or not those files were found in a directory.
Logically I'm thinking it goes like this:
$filelist = prompt for file with filenames
$directory = prompt for directory path where find command is performed
new Array[] = read $filelist line by line
for i, i > numberoflines, i++
if find Array[i] in $directory is false
echo "$i not found"
export to result.txt
I've been having a hard time getting Bash to do this, any ideas?

First, I would just assume that all the file names are supplied on standard input. E.g., if the file names.txt contains the file names and script.sh is the script, you can invoke it like
cat names.txt | ./script.sh
to obtain the desired behaviour (i.e., using the file-names from names.txt).
Second, inside script.sh you can loop as follows over all lines of the standard input
while read line
do
... # do your checks on $line here
done
Edit: I adapted my answer to use standard input instead of command line arguments, due to the problem indicated by @rici.
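Filling in the "do your checks" part, one possible sketch looks like the following. The function name check_names, the directory argument, and the example data are all assumptions made for illustration, not part of the original question.

```shell
#!/bin/sh
# Hypothetical helper: read one file name per line from standard input and
# report, for each, whether a regular file of that name exists under "$1".
check_names() {
    search_dir=$1
    while IFS= read -r name; do
        [ -n "$name" ] || continue            # skip empty lines
        # -name takes a pattern, so names containing *, ? or [ would be
        # treated as globs; this sketch assumes plain names.
        if [ -n "$(find "$search_dir" -type f -name "$name")" ]; then
            echo "$name found"
        else
            echo "$name not found"
        fi
    done
}

# Example data (made up for illustration).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/sub/a.txt" "$dir/b c.txt"
report=$(printf '%s\n' 'a.txt' 'b c.txt' 'missing.txt' | check_names "$dir")
echo "$report"
rm -rf "$dir"
```

In a real script the function would be fed from the list file and its output redirected, along the lines of check_names "$directory" < names.txt > result.txt.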

while IFS= read -r dirname
do
echo "$dirname" >> result.txt
while IFS= read -r filename
do
find "$dirname" -type f -name "$filename" >> result.txt
done <filenames.txt
done <dirnames.txt

convert a file path into string

I get an error when trying to replace a string in a directory path with another string:
sed: Error trying to read from {directory_path}: It's a directory
The shell script:
#!/bin/sh
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
R2K_TEMP_DIR=""
echo " Procesando archivos desde $R2K_SOURCE "
for file in $(find $R2K_SOURCE )
do
if [ -d $file ]
then
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
echo "directorio $R2K_TEMP_DIR"
else
# some code executes
:
fi
done
# find $R2K_PROCCESED -type f -size -200c -delete
I understand that the error is in this line:
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
but I don't know how to tell sh to treat the $file variable as a string and not as a directory object.

If you want to replace part of a path name, you can echo the path name and pipe it to sed.
You must also let the shell expand the variables by putting the sed command in double quotes instead of single quotes, and change the separator of the 's' command, like this:
R2K_TEMP_DIR=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")
Then you will be able to operate with slashes inside 's' command.
Update:
Even better is to remove the useless echo and use a here string instead (a bash feature that plain POSIX sh lacks):
R2K_TEMP_DIR=$(sed "s:$R2K_SOURCE:$R2K_PROCESSED:g" <<< "$file")
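The effect of the quoting can be checked with a fixed sample value. In this sketch the two variables come from the question, while the sample path source/foo/bar is made up.

```shell
#!/bin/sh
# Values taken from the question; the sample path is an assumption.
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
file="source/foo/bar"

# Single quotes: sed receives the literal text "$R2K_SOURCE" (quotes and
# dollar sign included), which never matches the path, so nothing changes.
literal=$(echo "$file" | sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g')

# Double quotes: the shell expands the variables first; ":" as separator
# avoids clashing with the "/" inside the values.
expanded=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")

echo "$literal"    # source/foo/bar
echo "$expanded"   # processed/foo/bar
```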

First, don't use:
for item in $(find ...)
because you might overload the command line. Besides, the for loop cannot start until the process in $(...) finishes. Instead:
find ... | while read item
You also need to watch out for funky file names. The for loop will choke on any file with spaces in its name. The find | while form copes with spaces inside a name, but plain read strips leading and trailing whitespace, interprets backslashes, and still breaks on newlines. Better:
find ... -print0 | while IFS= read -d '' -r item
This will put nulls between file names, and read will break on those nulls. This way, files with spaces, tabs, new lines, or anything else that could cause problems can be read without problems.
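As a rough demonstration, the sketch below creates three awkward names in a scratch directory and counts them with a null-delimited loop. It needs bash (for read -d '', $'...' and process substitution) and a find that supports -print0; the file names are invented.

```shell
#!/bin/bash
# Scratch directory with names containing double spaces and a newline.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with  two spaces" "$dir/"$'line1\nline2'

count=0
# IFS= keeps leading/trailing whitespace, -r keeps backslashes,
# -d '' makes read stop at each NUL byte written by -print0.
while IFS= read -r -d '' item; do
    count=$((count + 1))
    printf 'item %d: %q\n' "$count" "$item"   # %q shows the name unambiguously
done < <(find "$dir" -type f -print0)

echo "$count files"
rm -rf "$dir"
```

Process substitution is used here instead of a pipe only so that count is still set after the loop; the reading logic is the same either way.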
Your sed line is:
R2K_TEMP_DIR=$( sed 's/"$R2K_SOURCE"/"$R2K_PROCESSED"/g' $file )
What this is attempting to do is edit your $file, which is a directory. What you want to do is munge the directory name itself. Therefore, you have to echo the name into sed through a pipe, with the sed script in double quotes so that the variables expand, and with a separator other than / because the values contain slashes:
R2K_TEMP_DIR=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")
However, you might be better off using shell parameter expansion to edit the value of the variable.
Basically, you have a directory called source/ and all of the files you're looking for are under that directory. You simply want to change:
source/foo/bar
to
processed/foo/bar
You could do something like this ${file#source/}. The # says this is a left side filter and it will remove the least amount to match the glob expression after the #. Check the manpage for bash and look under Parameter Expansion.
Thus, you could do something like this:
#!/bin/bash
# bash rather than sh: read -d is a bash extension
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
R2K_TEMP_DIR=""
echo " Procesando archivos desde $R2K_SOURCE "
find "$R2K_SOURCE" -print0 | while IFS= read -r -d '' file
do
if [ -d "$file" ]
then
R2K_TEMP_DIR="processed/${file#source/}"
echo "directorio $R2K_TEMP_DIR"
else
# some code executes
:
fi
done
R2K_TEMP_DIR="processed/${file#source/}" removes the source/ from the start of $file and you merely prepend processed/ in its place.
Even better, it's way more efficient. In your original script, the $(..) creates another shell process to run your echo in which then pipes out to another process to run sed. (Assuming you use loentar's solution). You no longer have any subprocesses running. The whole modification of your directory name is internal.
By the way, this should also work too:
R2K_TEMP_DIR="$R2K_PROCESSED/${file#$R2K_SOURCE}"
I just didn't test that.
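For what it's worth, the variable-based form can be tried on a fixed string. This sketch uses a made-up path in which source/ appears twice, to show that the # expansion only touches the start of the value, whereas sed with the g flag rewrites every occurrence.

```shell
#!/bin/sh
R2K_SOURCE="source/"
R2K_PROCESSED="processed/"
file="source/foo/source/bar"      # sample path, invented for the demo

# ${var#pattern} strips the shortest match of pattern from the START only,
# so the second "source/" further along the path is left alone.
stripped=${file#"$R2K_SOURCE"}
renamed="${R2K_PROCESSED}${stripped}"
echo "$renamed"    # processed/foo/source/bar

# For comparison, sed with the g flag replaces every occurrence.
sed_all=$(echo "$file" | sed "s:$R2K_SOURCE:$R2K_PROCESSED:g")
echo "$sed_all"    # processed/foo/processed/bar
```

Because R2K_PROCESSED already ends in a slash, the sketch joins the two parts without an extra / in between.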

Concatenation in shell script and use of 'basename'

I want to read all file names from a particular directory and then create new files with those names in another directory, appending some string to each name.
E.g., if 'A', 'B', 'C' are in the 'logs' directory,
then the script should create 'A_tmp', 'B_tmp', 'C_tmp' in the 'tmp' directory.
What I am using is:
tempDir=./tmp/
logDir=./logs/
for file in $( find `echo $logDir` -type f )
do
name=eval basename $file
echo $name
name=$(echo $name | sed 's/.$//')
echo $tempDir
opFile=$tempDir$name
echo $opFile
done
But as I understand it, $file contains '\n' as its last character, so I am unable to concatenate the strings.
Right now I am not creating files, just printing all the names.
So, how can I remove the '\n' from the file name, and is my understanding correct?

Analysis
There are multiple issues to address in your script. Let's take it step by step:
tempDir=./tmp/
logDir=./logs/
for file in $( find `echo $logDir` -type f )
This scheme assumes no spaces in the file names (which is not an unusual restriction; avoiding problems with spaces in names is relatively tricky). Also, there's no need for the echo; just write:
for file in $(find "$logDir" -type f)
Continuing:
do
name=eval basename $file
This runs the basename command with the environment variable name set to the value eval and the argument $file. What you need here is:
name=$(basename "$file")
where the double quotes aren't strictly necessary because the name can't contain spaces (but it's not a bad habit to get into to quote all file names because sometimes the names do contain spaces).
echo $name
This would echo a blank line because name was not set.
name=$(echo $name | sed 's/.$//')
If name was set, this would chop off the last character, but if the name was A, you'd have nothing left.
echo $tempDir
opFile=$tempDir$name
echo $opFile
done
Give or take double quotes and the fact that you've not added the _tmp suffix to opFile, there's nothing wrong with the rest.
Synthesis
Putting the changes together, you end up with:
tempDir=./tmp/
logDir=./logs/
for file in $(find "$logDir" -type f)
do
name=$(basename "$file")
echo "$name" # Debug only
echo "$tempDir" # Debug only
opFile="$tempDir${name}_tmp"
echo "$opFile"
done
That shows all the intermediate results. You could perfectly well compress that down to:
tempDir=./tmp/
logDir=./logs/
for file in $(find "$logDir" -type f)
do
opFile="$tempDir"$(basename "$file")"_tmp"
echo "$opFile"
done
Or, using a simpler combination of double quotes because the names contain no spaces:
tempDir=./tmp/
logDir=./logs/
for file in $(find "$logDir" -type f)
do
opFile="$tempDir$(basename $file)_tmp"
echo "$opFile"
done
The echo is there as a surrogate for the copy or move operation you plan to execute, of course.
EDIT: ...and to remove restrictions on file names containing spaces and globbing characters, do it as:
tempDir=./tmp/
logDir=./logs/
find "$logDir" -type f |
while IFS= read -r file
do
opFile="${tempDir}${file##*/}_tmp"
echo "$opFile"
done
It will still fail for file names containing newlines. If you want to handle that then investigate a solution using find ... -print0 | xargs -0 or find ... -exec.
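One possible shape for the find ... -exec variant is sketched below. It is an illustration under assumptions: scratch directories stand in for ./tmp/ and ./logs/, the sample names are invented, and the new files are simply created empty.

```shell
#!/bin/sh
# Scratch stand-ins for ./tmp/ and ./logs/ (both end in a slash, as above).
tempDir=$(mktemp -d)/
logDir=$(mktemp -d)/
touch "${logDir}A" "${logDir}B C"

# find hands the names to a small inline script as arguments, so its
# output is never parsed. "${f##*/}" strips the directory part of each path.
find "$logDir" -type f -exec sh -c '
    dest=$1; shift
    for f in "$@"; do
        : > "${dest}${f##*/}_tmp"      # create the (empty) new file
    done
' sh "$tempDir" {} +

ls "$tempDir"
```

Since the names travel as arguments rather than as lines of text, spaces and even newlines in file names need no special handling.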

Try the following.
#!/bin/sh
tmpDir=./tmp/
logDir=./logs/
# list all files in log directory, pipe into a loop that reads each path line
# by line..
# Also note that there is no newline in this case since it is swallowed by 'read'.
find $logDir -type f | while read path; do
# get the basename of the path
name=`basename $path`
# build the destination path in the temporary directory.
dest="$tmpDir/${name}_tmp"
echo $dest
done
Shell scripts can concatenate strings simply by writing them next to each other, as demonstrated with $tmpDir/${name}_tmp. There is no need to strip anything from the name, since read swallows the trailing newline.
find ... | while read is a very useful construct when you want to read multiple lines of anything; the same loop also works with input redirected from a file.
while read line; do
echo $line
done < filename.txt
Edit: clarified

Try something like this:
tempDir=./tmp/
logDir=./logs/
for file in $( find `echo $logDir` -type f )
do
name=`eval basename $file|tr -d "\n"`_tmp
echo $name
done

If you change
name=eval basename $file
to
name=`eval basename $file`
then afterwards name contains what you want.
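To see why the original line leaves name empty, here is a small sketch with an invented sample path. It also uses $(...) without eval, which is enough to capture the output of basename.

```shell
#!/bin/sh
file=./logs/A        # sample path, made up for the demo
unset name

# "name=eval" in front of a command is a temporary environment assignment
# for that command (basename); the shell variable name is not set afterwards.
name=eval basename "$file" > /dev/null
after_prefix=${name-unset}

# Command substitution captures the output of basename; no eval is needed.
name=$(basename "$file")
echo "$after_prefix / $name"    # unset / A
```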
