Bash script help/evaluation

I'm trying to learn some scripting, but I can't find a solution for one piece of functionality.
I'd also like to ask for an evaluation of my script, as it can probably be reduced in complexity and number of lines.
The purpose of this script is to download random, encrypted MySQL backups from Amazon S3, restore the dump, and run some random MySQL queries.
I'm not sure how to email the output of the printf statements - one prints the headers and the other the actual data. I've tried to format the output so it looks like the example below, but I had to exclude the headers from the loop:
Database:                               Table:                        Entries:
database1                               random_table                  0
database2                               random_table                  0
database3                               random_table                  0
database4                               random_table                  0
I would like to include this output in the email, and also change the email subject based on the success or failure of the script.
I probably use too many if statements, and the MySQL queries are probably too complicated.
Script:
#!/usr/bin/env bash
# DB Details:
db_user="user"
db_pass="password"
db_host="localhost"
# Date
date_stamp=$(date +%d%m%Y)
# Initial Setup
data_dir="/tmp/backup"
# Checks
if [ ! -e /usr/bin/s3cmd ]; then
echo "Required package (http://s3tools.org/s3cmd)"
exit 2
fi
if [ -e /usr/bin/gpg ]; then
gpg_key=$(gpg -K | tr -d "{<,>}" | awk '/an@example.com/ { print $4 }')
if [ "$gpg_key" != "an@example.com" ]; then
echo "No GPG key"
exit 2
fi
else
echo "No GPG package"
exit 2
fi
if [ -d $data_dir ]; then
rm -rf $data_dir/* && chmod 700 $data_dir
else
mkdir $data_dir && chmod 700 $data_dir
fi
# S3 buckets
bucket_1=s3://test/
# Download backup
for backup in $(s3cmd ls s3://test/ | awk '{ print $2 }')
do
latest=$(s3cmd ls $backup | awk '{ print $2 }' | sed -n '$p')
random=$(s3cmd ls $latest | shuf | awk '{ print $4 }' | sed -n '1p')
s3cmd get $random $data_dir >/dev/null 2>&1
done
# Decrypting Files
for file in $(ls -A $data_dir)
do
filename=$(echo $file | sed 's/\.e//')
gpg --out $data_dir/$filename --decrypt $data_dir/$file >/dev/null 2>&1 && rm -f $data_dir/$file
if [ $? -eq 0 ]; then
# Decompressing Files
bzip2 -d $data_dir/$filename
if [ $? -ne 0 ]; then
echo "Decompression Failed!"
fi
else
echo "Decryption Failed!"
exit 2
fi
done
# MySQL Restore
printf "%-40s%-30s%-30s\n\n" Database: Table: Entries:
for dump in $(ls -A $data_dir)
do
mysql -h $db_host -u $db_user -p$db_pass < $data_dir/$dump
if [ $? -eq 0 ]; then
# Random DBs query
db=$(echo $dump | sed 's/\.sql//')
random_table=$(mysql -h $db_host -u $db_user -p$db_pass $db -e "SHOW TABLES" | grep -v 'Tables' | shuf | sed -n '1p')
db_entries=$(mysql -h $db_host -u $db_user -p$db_pass $db -e "SELECT * FROM $random_table" | grep -v 'id' | wc -l)
printf "%-40s%-30s%-30s\n" $db $random_table $db_entries
mysql -h $db_host -u $db_user -p$db_pass -e "DROP DATABASE $db"
else
echo "The system was unable to restore backups!"
rm -rf $data_dir
exit 2
fi
done
#Remove backups
rm -rf $data_dir

You'll get the best answers if you ask specific questions (rather than "please review my code"), and if you limit each post to a single question. Regarding emailing the output of your printf statements:
You can group statements into a block and then pipe the output of a block into another program. For example:
{
echo "This is a header"
echo
for x in {1..10}; do
echo "This is row $x"
done
} | mail -s "Here is my output" lars@example.com
If you want to make the email subject conditional upon the success or
failure of something elsewhere in the script, you can (a) save your
output to a file, and then (b) email the file after building the
subject line:
{
echo "This is a header"
echo
for x in {1..10}; do
echo "This is row $x"
done
} > output
if is_success; then
subject="SUCCESS: Here is your output"
else
subject="FAILURE: Here are your errors"
fi
mail -s "$subject" lars@example.com < output
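Putting those two pieces together for your script would look roughly like this. This is only a sketch, not a drop-in replacement: it reuses your db_* variables and data_dir from above, leaves out the random-table query (marked with a comment), and assumes the restored dumps end in .sql.
status="SUCCESS"
report=$(mktemp)
{
    printf "%-40s%-30s%-30s\n\n" "Database:" "Table:" "Entries:"
    for dump in "$data_dir"/*.sql; do
        db=$(basename "$dump" .sql)
        if mysql -h "$db_host" -u "$db_user" -p"$db_pass" < "$dump"; then
            # ... pick a random table and count its rows here, as in your script ...
            printf "%-40s%-30s%-30s\n" "$db" "$random_table" "$db_entries"
        else
            status="FAILURE"
        fi
    done
} > "$report"
mail -s "$status: MySQL backup verification" lars@example.com < "$report"
rm -f "$report"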

Related

df -h if freespace equals then

The error I am getting is with the df command on macOS. I would like to specify a value in gigabytes, and perhaps allow the user to choose a different drive if / is full.
destination="$HOME/Desktop/sandbox"
if [ $(df -h --output=avail /|tail -n1) -lt 300000 ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.tar ; do echo copying "$files" ; cp "$files" "$destination" ; read -n 1 -p "Press any key..." ; done
fi
Not sure if df -h / | tail -1 | awk '{print $4}' | sed 's/..$//' is a good option
destination="$HOME/Desktop/sandbox"
freespace="$(df -h / | tail -1 | awk '{print $4}' | sed 's/..$//')"
if [ "$freespace" -lt 300 ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.sh ; do echo copying "$files" ; cp "$files" "$destination" ; read -n 1 -p "Press any key..." ; done
fi
The problem is that you are using df -h, which means "human readable" and always appends a unit after the number (M, G, K, ...). There are plenty of ways to cut that off with grep, awk, cut or sed, but perhaps the best approach is to use a different option, e.g. df -m, to get the output in megabytes and multiply the required space accordingly.
I would also propose adding a grep for $destination (the mount point visible in the df output) to select the right filesystem, rather than relying on it always being the last line of the output.
Try something like this:
destination="$HOME/Desktop/sandbox"
free_space=$(df -m | grep "$destination" | awk '{print $4}')
if [ "$free_space" -lt $((300 * 1024)) ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.tar; do
echo copying "$files"
cp "$files" "$destination"
read -n 1 -p "Press any key..."
done
fi
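If you would rather not grep through the full df listing, df also accepts a path argument and reports only the filesystem that path lives on. A minimal sketch of that variant (it assumes the "Available" figure is the fourth column, which holds for both GNU and macOS df):
destination="$HOME/Desktop/sandbox"
# -m reports sizes in megabytes; passing the path selects its filesystem
free_mb=$(df -m "$destination" | awk 'NR==2 {print $4}')
if [ "$free_mb" -lt $((300 * 1024)) ]; then
    echo "There is less than 300GB available..."
    exit 1
fi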

Invalidate metadata gives TableNotFoundException although I can see the table in the list of tables

I am trying to run a script that validates whether all the tables that are supposed to be created as part of my deployment actually exist. This is my script:
set -x
ENV=$1
. /user/setenv.sh
ticket=""
wlf_kinit ticket
echo "Checking if Kerberos Ticket is available.."
klist $ticket
if [ $? -eq 1 ]; then
echo "Kerberos Ticket not found..Exiting"
exit 1
fi
echo "Validating Hadoop tables" > /tmp/psk1/db_validation_log.txt
db_dir=/user/db
cd $db_dir
for current_directory in `find . -maxdepth 1 -type d`
do
#current_directory=`echo $current_directory | awk -F '/' '{print $2}'`
echo $current_directory
if [ "$current_directory" != "." ]; then
current_directory=`echo $current_directory | awk -F '/' '{print $2}'`
if [ "${current_directory:0:1}" = "v" ]; then
dir=$db_dir/$current_directory/views
else
dir=$db_dir/$current_directory/tables
fi
cd $dir
pwd
#echo "Validating tables in "$current_directory >> /tmp/psk1/db_validation_log.txt
find . -name '*.hql' | while read rec; do
echo $rec
tbl_name=`echo $rec | awk -F '/' '{print $2}' | awk -F '.' '{print $1}'`
result=$(impala-shell --quiet --delimited --ssl -i ${impala_host} -ku ${user_id}${impala_realm} -q "set request_pool = ${request_pool}; use $current_directory$ENV; invalidate metadata $tbl_name; show tables like '$tbl_name';") 2>> /tmp/psk1/db_validation_log.txt
#echo $result
if [ $? -eq 0 ]; then
if [ ${result} == ${tbl_name} ]; then
echo "$tbl_name exists" #>> /tmp/psk1/db_validation_log.txt
else
echo "$tbl_name does not exist" #>> /tmp/psk1/db_validation_log.txt
fi
else
echo $current_directory$ENV"."$tbl_name" Query error" >> /tmp/psk1/db_validation_log.txt
impala-shell --quiet --delimited --ssl -i ${impala_host} -ku ${user_id}${impala_realm} -q "set request_pool = ${request_pool}; use $current_directory$ENV; invalidate metadata $tbl_name; show tables like '$tbl_name';" 2>> /tmp/psk1/db_validation_log.txt
fi
done
fi
done
cat /tmp/psk1/db_validation_log.txt | mail -a /tmp/psk1/db_validation_log.txt -s 'Hadoop DB validation completed. Check the attached log.' -r ${from_email} ${to_email} 2>> /tmp/psk1/db_validation_log.txt
kdestroy -c $ticket
This script prints the failed queries to a .txt file and sends it in an email. In the text file I see some of the queries failing with TableNotFoundException. But when I open the impala shell and list out the tables, I am able to see the table in the list. I am not sure what is causing this inconsistency.
Any help would be appreciated. Thank you.

Grep inside bash script not finding item

I have a script which checks a key in one file against a key in another to see if it exists in both. However, in the script the grep never finds anything, while on the command line it does.
#!/bin/bash
# First arg is the csv file of repo keys separated by line and in
# this manner 'customername,REPOKEY'
# Second arg is the log file to search through
log_file=$2
csv_file=$1
while read line;
do
customer=`echo "$line" | cut -d ',' -f 1`
repo_key=`echo "$line" | cut -d ',' -f 2`
if [ `grep "$repo_key" $log_file` ]; then
echo "1"
else
echo "0"
fi
done < $csv_file
The CSV file is formatted as follows:
customername,REPOKEY
and the log file is as follows:
REPOKEY
REPOKEY
REPOKEY
etc
I call the script by doing ./script csvfile.csv logfile.txt
Rather than checking the output of the grep command, use grep -q and check its return status:
if grep -q "$repo_key" "$log_file"; then
echo "1"
else
echo "0"
fi
Also your script can be simplified to:
log_file=$2
csv_file=$1
while IFS=, read -r customer repo_key; do
if grep -q "$repo_key" "$log_file"; then
echo "1"
else
echo "0"
fi
done < "$csv_file"
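For example, with a hypothetical CSV and log file (contents invented purely to illustrate), a run would look like this:
$ cat csvfile.csv
acme,REPOKEY1
globex,REPOKEY9
$ cat logfile.txt
REPOKEY1
REPOKEY2
$ ./script csvfile.csv logfile.txt
1
0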
Use the exit status of the grep command to print 1 or 0:
repo_key=`echo "$line" | cut -d ',' -f 2`
grep -q "$repo_key" "$log_file"
if [ $? -eq 0 ]; then
echo "1"
else
echo "0"
fi
-q suppresses the output so that nothing is printed
$? is the exit status of the grep command: 0 on a successful match and 1 when no match is found
you can have a much simpler version as
grep -q "$repo_key" $log_file
echo $?
which prints the exit status directly: 0 for a match and 1 for no match (the inverse of the 1/0 output above)
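A quick demonstration of those status values (file contents invented just for the example):
$ printf 'REPOKEY1\nREPOKEY2\n' > logfile.txt
$ grep -q REPOKEY1 logfile.txt; echo $?
0
$ grep -q MISSINGKEY logfile.txt; echo $?
1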

Syntax error: “(” unexpected (expecting “fi”)

filein="users.csv"
IFS=$'\n'
if [ ! -f "$filein" ]
then
echo "Cannot find file $filein"
else
#...
groups=(`cut -d: -f 6 "$filein" | sed 's/ //'`)
fullnames=(`cut -d: -f 1 "$filein"`)
userid=(`cut -d: -f 2 "$filein"`)
usernames=(`cut -d: -f 1 "$filein" | tr [A-Z] [a-z] | awk '{print substr($1,1,1) $2}'`)
#...
for group in ${groups[*]}
do
grep -q "^$group" /etc/group ; let x=$?
if [ $x -eq 1 ]
then
groupadd "$group"
fi
done
#...
x=0
created=0
for user in ${usernames[*]}
do
useradd -n -c ${fullnames[$x]} -g "${groups[$x]}" $user 2> /dev/null
if [ $? -eq 0 ]
then
let created=$created+1
fi
#...
echo "${userid[$x]}" | passwd --stdin "$user" > /dev/null
#...
echo "Welcome! Your account has been created. Your username is $user and temporary
password is \"$password\" without the quotes." | mail -s "New Account for $user" -b root $user
x=$x+1
echo -n "..."
sleep .25
done
sleep .25
echo " "
echo "Complete. $created accounts have been created."
fi
I'm guessing the problem is that you're trying to capture command output in arrays without actually using command substitution. You want something like this:
groups=( $( cut... ) )
Note the extra set of parentheses with $ in front of the inner set.
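Spelled out with the commands from the question, the assignments would look like this (a sketch; note that with IFS=$'\n' set as in the script, each output line becomes one array element):
# quoting the tr sets avoids accidental glob expansion of [A-Z] and [a-z]
groups=( $(cut -d: -f6 "$filein" | sed 's/ //') )
fullnames=( $(cut -d: -f1 "$filein") )
userid=( $(cut -d: -f2 "$filein") )
usernames=( $(cut -d: -f1 "$filein" | tr 'A-Z' 'a-z' | awk '{print substr($1,1,1) $2}') )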

Bash script help/evaluation

For the same script, you can move this out of the loop:
random_tables=$(mysql -h $db_host -u $db_user -p$db_pass $db -e "SHOW TABLES" | grep -v 'Tables')
table_nb=$(wc -l <<<"$random_tables")
and use this inside the loop instead of shuf:
random_table=$(sed -n $((RANDOM%table_nb+1))p <<<"$random_tables")
One remark: $? is the status of the last command executed, so after the && rm it is the exit status of rm, not of the gpg decrypt.
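If you want the if to test the decryption itself, one option (a sketch based on the loop body above) is to test gpg directly and only remove the encrypted file on success:
if gpg --output "$data_dir/$filename" --decrypt "$data_dir/$file" >/dev/null 2>&1; then
    rm -f "$data_dir/$file"
    bzip2 -d "$data_dir/$filename" || echo "Decompression Failed!"
else
    echo "Decryption Failed!"
    exit 2
fi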
