Bash script to read specific values from the files of an entire folder - bash

I have a problem creating a script that reads specific values from all the files in a folder.
I have a number of email files in a directory and I need to extract 2 specific values from each file.
After that I have to put them into a new file that looks like this:
--------------
To: value1
value2
--------------
This is what I want to do, but I don't know how to create the script:
# I am putting the name of the files into a temp file
ls -l | awk '{print $9 }' >tmpfile
# use for the name of a file
date=`date +"%T"`
# The first specific value from file (phone number)
var1=`cat tmpfile | grep "To: 0" | awk '{print $2 }' | cut -b -10 `
# The second specific value from file(subject)
var2=`cat file | grep Subject | awk '{print $2$3$4$5$6$7$8$9$10 }'`
# Put the first value in a new file on the first row
echo "To: 4"$var1"" > sms-$date
# Put the second value in the same file on the second row
echo ""$var2"" >>sms-$date
.......
and do the same for every file in the directory
I tried using while and for loops but I couldn't finalize the script.
Thank You

I've made a few changes to your script, hopefully they will be useful to you:
#!/bin/bash
for file in *; do
    var1=$(awk '/To: 0/ {print substr($2,1,10)}' "$file")
    var2=$(awk '/Subject/ {for (i=2; i<=10; ++i) s=s$i; print s}' "$file")
    datestamp=$(date +"%T")
    outfile="sms-$datestamp"
    i=0
    while [ -f "$outfile" ]; do outfile="sms-$datestamp-"$((i++)); done
    echo "To: 4$var1" > "$outfile"
    echo "$var2" >> "$outfile"
done
The for loop just goes through every file in the folder that you run the script from.
I have added an additional suffix $i to the end of the file name. If no file with the same timestamp already exists, then the file will be created without the suffix. Otherwise the value of $i will keep increasing until there is no file with the same name.
I'm using $( ) rather than backticks; this is just a personal preference, but it can be clearer in my opinion, especially when there are other quotes about.
There's not usually any need to pipe the output of grep to awk. You can do the search in awk using the / / syntax.
I have removed the cut -b -10 and replaced it with substr($2, 1, 10), which prints the first 10 characters of column 2 (awk's substr counts from 1).
It's not much shorter but I used a loop rather than the $2$3..., I think it looks a bit neater.
There's no need for all the extra " in the two output lines.
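For example, these two pipelines print the same thing; mailfile here is just a hypothetical stand-in for one of the email files:
# grep piped into awk and cut
grep "To: 0" mailfile | awk '{print $2}' | cut -b -10
# the same search and trim done entirely in awk
awk '/To: 0/ {print substr($2, 1, 10)}' mailfile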

I suggest trying the following:
#!/bin/sh
RESULT_FILE=sms-`date +"%T"`
DIR=.
fgrep -l 'To: 0' "$DIR"/* | while read FILE; do
    var1=`fgrep 'To: 0' "$FILE" | awk '{print $2 }' | cut -b -10`
    var2=`fgrep 'Subject' "$FILE" | awk '{print $2$3$4$5$6$7$8$9$10 }'`
    echo "To: 4$var1" >>"$RESULT_FILE"
    echo "$var2" >>"$RESULT_FILE"
done
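Note that, as written, every message's two lines are appended to the same sms-<time> file. If you would rather have one output file per message, the RESULT_FILE assignment could be moved inside the loop, for example (just a sketch, naming the output after the input file):
RESULT_FILE="sms-$(basename "$FILE")"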

Related

Why am I getting this file error and a cat error even though I am passing the text file in the argument?

The program calls for us to read in a directory full of text files and parse the data from those files into their respective attributes.
Then, once the data is set, load a general template which has those attributes in the text.
I'm using a sed command to replace the specific attributes, but only if the number of students is greater than 50. If so, it runs the sed command and writes to a file in a directory.
But I am getting this error when I pass:
test3.sh ./data assign4.template 12/16/2021 ./output
Error
cat: assign4.template: No such file or directory
test3.sh: line 62: output/MAT3103.crs: No such file or directory
The current file is MAT4353.crs
Now what I am thinking is that, for the file or directory error, it is looking in that folder and searching for a file with that name,
but I'm not entirely sure how to resolve that.
As for the cat template error, I don't get that, since I am passing the template in the terminal.
As for the other parameters being passed: the date is also substituted in the sed command, and all output files should be written to the directory defined by the last argument. This directory may or may not already exist. Each file should be named by the course's department code and number, with the extension .warn.
Here is the full code:
#!/bin/bash
# checking that the user has passed at least four arguments
if [ $# -ne 4 ]
then
echo "Atleast 4 argument should be passed"
exit 1
fi
# check if the output directory exists
if [ -d output ]
then
# if output directory exists will get deleted
echo "output directory already exists. So removing its contents"
rm -f output/*
else
# output directory does not exist, so gets created here
echo "output directory does not exist. So creating a new directory"
mkdir output
fi
max_students=50
template=$2
dt=$3
cd $1
for i in *; do
echo The current file is ${i}
dept_code=$(awk 'NR==2 {print $1 ; exit}' $i)
echo $dept_code
dept_name=$(awk 'NR==2 {print $2 ; exit}' $i)
echo $dept_name
course_name=$(awk 'FNR==2' $i)
echo $course_name
course_sched=$(awk 'FNR==3' $i | awk '{print $1}')
echo $course_sched
course_start=$(awk 'FNR==3' $i | awk '{print $2}')
echo $course_start
course_end=$(awk 'FNR==3' $i | awk '{print $3}')
echo $course_end
credit_hours=$(awk 'FNR==4' $i)
echo $credit_hours
num_students=$(awk 'FNR==5' $i)
echo $num_students
# checking if number of students currently enrolled > max students
if (( $(echo "$num_students > $max_students" |bc -l) ))
then
# output filename creation
out_file=${i}
# using example Template and sed command to replace the variables
cat $template | sed -e "s/\[\\[\dept_code\]\]/$dept_code/" | sed -e "s/\[\\[\dept_name\]\]/$dept_name/" | sed -e "s|\[\[course_name\]\]|$course_name|" | sed -e "s|\[\[course_start\]\]|$$
fi
done
You define the variable as
template=$2
and since your second parameter is assign4.template, this is what the variable template is set to. Then you do a
cat $template
which is, first of all, unnecessary, since you can do an input redirection on sed instead, but above all it requires that the file exists in your working directory. Since you have previously done a
cd $1
it means that the file data/assign4.template does not exist. You have to create this file before you can use your script.
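Alternatively, if you would rather keep passing the template path relative to where you start the script, you can resolve the paths to absolute form before the cd. A rough sketch, assuming GNU realpath is available and that the template placeholders look like [[dept_code]]:
template=$(realpath "$2")      # absolute path to the template, resolved before the cd
out_dir=$(realpath -m "$4")    # -m lets the output directory not exist yet
mkdir -p "$out_dir"
cd "$1" || exit 1
# later, read the template through redirection instead of cat, e.g.
# sed -e "s|\[\[dept_code\]\]|$dept_code|" < "$template" > "$out_dir/$out_file"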
Use single quotes in your positional arguments:
test3.sh './data' 'assign4.template' '12/16/2021' './output'
or
test3.sh data assign4.template '12/16/2021' output

Breakdown a string into two arrays using awk/sed/grep

I want to create two different arrays in a shell/bash script from the content of a text file which has details about different files. How do I extract the directories into one array and the filenames into another array, using awk/sed/grep?
I have a text file as shown below:
2017-02-04 07:18 /temp/folder1/filename_20170204_something.txt
2017-03-04 07:18 /temp/folder2/filename_20170204_20170304.txt
2017-04-04 07:18 /temp/folder3/filename_20170404_.txt
The arrays should end up containing:
directories_list= {folder1,folder2,folder3}
file_list = {filename_20170204.txt,filename_20170304.txt,filename_20170404.txt}
I would use awk to split the lines into columns, then print the column that holds the folder and the column that holds the file name. You can tell awk what the delimiting character is with the -F option.
This script stores the folders in one array and the files in another.
#!/bin/bash
FOLDERS=() # declares FOLDERS as an array
FILES=()   # declares FILES as an array
INPUT=input.txt # change to the path of your data file
while read LINE
do
    FOLDER=$(echo $LINE | awk -F / '{print $3}')
    FILE=$(echo $LINE | awk -F / '{print $4}')
    echo "Reading next line..."
    echo FOLDER: $FOLDER
    echo FILE: $FILE
    echo ""
    FOLDERS+=( "$FOLDER" ) # appends $FOLDER to the FOLDERS array
    FILES+=( "$FILE" )     # appends $FILE to the FILES array
done < $INPUT
# Now the FOLDERS array and FILES array have what you want
echo FOLDERS array: "${FOLDERS[@]}"
echo FILES array: "${FILES[@]}"
That's assuming you have the input.txt file in the same directory and it contains your sample data.
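If you later want to walk the two arrays in step (assuming each folder entry lines up with the file read from the same line), you could do something like:
for i in "${!FOLDERS[@]}"; do
    echo "${FOLDERS[$i]} -> ${FILES[$i]}"
done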
Read the file line by line, split the fields with read and IFS, use basename and dirname, and append to the arrays:
cat <<EOF >file
2017-02-04 07:18 /temp/folder1/filename_20170204_something.txt
2017-03-04 07:18 /temp/folder2/filename_20170204_20170304.txt
2017-04-04 07:18 /temp/folder3/filename_20170404_.txt
EOF
dirs=() files=()
while IFS=' ' read -r _ _ path; do
dirs+=("$(basename "$(dirname "$path")")")
files+=("$(basename "$path")")
done <file
declare -p dirs files
How do you want to handle duplicate entries in the arrays, and is there any specific order in which you want to save the files?
If not, you can use the commands below (they remove duplicate entries and sort the output):
folders=()
files=()
folders=( `awk '{print $NF}' <INPUT_FILE> | awk -F'/' '{print $(NF-1)}' | sort -nr | uniq` )
files=( `awk '{print $NF}' <INPUT_FILE> | awk -F'/' '{print $NF}' | sort -nr | uniq` )
Below is the explanation for the awk commands,
awk '{print $NF}' <INPUT_FILE> -> takes the last field in the input file
awk -F'/' '{print $(NF-1)}' -> splits the last field with / as the delimiter and takes the penultimate column
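Using the sample lines from the question (input.txt here is just a stand-in name for your input file), the first two pipeline stages look like this:
$ awk '{print $NF}' input.txt
/temp/folder1/filename_20170204_something.txt
/temp/folder2/filename_20170204_20170304.txt
/temp/folder3/filename_20170404_.txt
$ awk '{print $NF}' input.txt | awk -F'/' '{print $(NF-1)}'
folder1
folder2
folder3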
Hope this helps !

Bash Shell: Infinite Loop

The problem is the following: I have a file where each line has this form:
id|lastName|firstName|gender|birthday|joinDate|IP|browser
I want to sort all the first names in that file alphabetically and print them, one on each line, but each name only once.
I have created the following program, but for some reason it creates an infinite loop:
array1=()
while read LINE
do
if [ ${LINE:0:1} != '#' ]
then
IFS="|"
array=($LINE)
if [[ "${array1[#]}" != "${array[2]}" ]]
then
array1+=("${array[2]}")
fi
fi
done < $3
echo ${array1[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort
NOTES
if [ ${LINE:0:1} != '#' ] : this command is used because there are comments in the file that I don't want to print
$3 : filename
array1 : is used for all the separate names
Wow, there's a MUCH simpler and cleaner way to achieve this, without having to mess with the IFS variable or using arrays. You can use "for" to do this:
First I created a file with the same structure as yours:
$ cat file
id|lastName|Douglas|gender|birthday|joinDate|IP|browser
id|lastName|Tim|gender|birthday|joinDate|IP|browser
id|lastName|Andrew|gender|birthday|joinDate|IP|browser
id|lastName|Sasha|gender|birthday|joinDate|IP|browser
#id|lastName|Carly|gender|birthday|joinDate|IP|browser
id|lastName|Madson|gender|birthday|joinDate|IP|browser
Here's the script I wrote using "for":
#!/bin/bash
for LINE in `cat file | grep -v "^#" | awk -F'|' '{print$3}' | sort -u`
do
echo $LINE
done
And here's the output of this script:
$ ./script.sh
Andrew
Douglas
Madson
Sasha
Tim
Explanation:
for LINE in `cat file`
Creates a loop that reads each line of "file" (strictly, each whitespace-separated word, but these lines contain no spaces). The commands between backticks are run by the shell; for example, if you wanted to store the date inside of a variable you could use "VARDATE=`date`".
grep -v "^#"
The option -v is used to exclude results matching the pattern, in this case the pattern is "^#". The "^" character means "line begins with". So grep -v "^#" means "exclude lines beginning with #".
awk -F'|' '{print$3}'
The -F option switches the column delimiter from the default (the default is a space) to whatever you put between ' after it, in this case the "|" character.
The '{print$3}' prints the 3rd column.
sort -u
And the "sort -u" command sorts the names alphabetically; the -u (unique) option also removes duplicates, so each name is printed only once.

copy files from mount point listed in a csv

I need to move over 100,000 images from one server to another via a mount point. I have a .csv with them listed and I'm looking to script it.
The csv looks like this:
"images1\002_0001\thumb",53717902.jpg,/www/images/002_0001/thumb/
"images1\002_0001\thumb",53717901.jpg,/www/images/002_0001/thumb/
"images1\002_0001\thumb",53717900.jpg,/www/images/002_0001/thumb/
Comma separated, we have the source, the image name and the destination.
I was thinking of using awk to capture each as a variable:
SOURCE=`awk -F ',' '{ print $1 }' test.csv`
IMGNAME=`awk -F ',' '{ print $2 }' test.csv`
DEST=`awk -F ',' '{ print $3 }' test.csv`
This is where I'm getting stuck, my loop:
while read line
do
cp $SOURCE${IMGNAME} $DEST
done <test.csv
This has copied the first name it finds into all the directories.
You could use what you have and move the variable declaration into the loop referencing $line, or you could use IFS, as suggested below.
while IFS=, read -r src filename dest
do
cp $src${filename} $dest
done <test.csv
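One thing to watch: with the sample CSV, src still carries the literal double quotes and backslashes (and has no trailing slash), so it needs some cleanup before the cp will work. You can see what read produces for the first line like this:
$ IFS=, read -r src filename dest <<< '"images1\002_0001\thumb",53717902.jpg,/www/images/002_0001/thumb/'
$ printf '%s\n' "$src" "$filename" "$dest"
"images1\002_0001\thumb"
53717902.jpg
/www/images/002_0001/thumb/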
There are many ways to do it; some examples.
If you have no spaces in the directory strings, you can do it even from the shell:
sed -E 's/"/cp /; s/",/\// ; s/,/ /;s/\\/\//g' test.csv | /bin/bash
It's better to check it before you try it; you're talking about a lot of files...
sed -E 's/"/cp /; s/",/\// ; s/,/ /;s/\\/\//g' test.csv | less
It can happen that you have spaces in the directory name, like My Windows Like Dir Name. In this case you need double quotes (the double quotes are probably there for exactly this reason...).
You can do it using only awk (again from the shell):
awk -F',' '{gsub(/"/, "", $1); gsub(/\\/, "/", $1); print "cp \""$1"/" $2"\" \"" $3"\""}' test.csv | /bin/bash
or, equivalently:
awk -F',' '{gsub(/"/, "", $1); gsub(/\\/, "/", $1); printf ("cp \"%s/%s\" \"%s\"\n",$1,$2,$3)}' test.csv | /bin/bash
Always check it in advance by leaving off the final pipe | /bin/bash, and maybe adding | head -n 10 to see only the first 10 lines.
The script can be written:
while IFS=, read -r SOURCE IMGNAME DEST
do
    SOURCE=${SOURCE//\\/\/}    # Here you need to change "\" into "/"
    SOURCE=${SOURCE//\"/}      # Here I like to kill the ""
    cp "${SOURCE}/${IMGNAME}" "$DEST"   # Here I put the "" back again
done <test.csv
Note: I think you need to change the Windows-style "\" into the Unix-style "/", hence the substitution rules.
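As a quick illustration of the two substitutions, using the first source value from the CSV above:
$ SOURCE='"images1\002_0001\thumb"'
$ SOURCE=${SOURCE//\\/\/}   # backslashes become forward slashes
$ SOURCE=${SOURCE//\"/}     # the double quotes are removed
$ echo "$SOURCE"
images1/002_0001/thumb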

using awk within loop to replace field

I have written a script that finds the hash value of each word in a dictionary and outputs it in the form "word:md5sum". I then have a file of names, and I would like to produce each name followed by every hash value, i.e.
tom:word1hash
tom:word2hash
.
.
bob:word1hash
and so on. Everything works fine but I cannot figure out the substitution. Here is my script:
#!/bin/bash
#/etc/dictionaries-common/words
cat words.txt | while read line; do echo -n "$line:" >> dbHashFile.txt
echo "$line" | md5sum | sed 's/[ ]-//g' >> dbHashFile.txt; done
cat users.txt | while read name
do
cat dbHashFile.txt >> nameHash.txt;
awk '{$1="$name"}' nameHash.txt;
cat nameHash.txt >> dbHash.txt;
done
the line
awk '{$1="$name"}' nameHash.txt;
is where I attempt to do the substitution.
Thank you for your help.
Try replacing the entire contents of the last loop (both cats and the awk) with:
awk -v name="$name" -F ':' '{ print name ":" $2 }' dbHashFile.txt >>dbHash.txt
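The reason the original awk '{$1="$name"}' did nothing useful is that the single quotes stop the shell from expanding $name, so awk assigns the literal string "$name" (and the result was never written anywhere); -v name="$name" passes the shell variable into awk explicitly. A quick illustration with made-up file contents and hash values:
$ cat dbHashFile.txt        # hypothetical contents, hashes are made up
hello:aaaa1111bbbb2222
world:cccc3333dddd4444
$ name=tom
$ awk -v name="$name" -F ':' '{ print name ":" $2 }' dbHashFile.txt
tom:aaaa1111bbbb2222
tom:cccc3333dddd4444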
