How to grep the output of a command inside a shell script when scheduling using cron - shell

I have a simple shell script that checks whether my EMR job is running and prints a log message. When I schedule the script with cron it does not work properly: it always prints the message from the if block, because the value of the status_live variable is always empty. When I run the script manually it works properly. Can anyone suggest what is wrong here?
#!/bin/sh
status_live=$(yarn application -list | grep -i "Streaming App")
if [ -z $status_live ]
then
echo "Running spark streaming job again at: "$(date) &
else
echo "Spark Streaming job is running, at: "$(date)
fi

Your script fails under cron because cron runs jobs with a minimal environment: your login profile is not loaded, so PATH and other variables that yarn needs may be missing.
For example, try to run your script as another user, such as nobody, that has no login shell.
sudo -u nobody <script-full-path>
It will most likely fail because that user has none of your environment.
The solution is to load your user environment in the script. Just source your .bash_profile on the line after the shebang (. is the portable spelling of source and also works when /bin/sh is not bash):
sed -i "1a . $HOME/.bash_profile" <script-full-path>
Your script should look like:
#!/bin/sh
. /home/<your user name>/.bash_profile
status_live=$(yarn application -list | grep -i "Streaming App")
if [ -z "$status_live" ]
then
echo "Running spark streaming job again at: $(date)" &
else
echo "Spark Streaming job is running, at: $(date)"
fi
Now try to run it again as user nobody; if it works, then it should work under cron as well.
sudo -u nobody <script-full-path>
Note that a cron job has no terminal attached: its output is mailed to the crontab owner or discarded, so redirect the script's output to a log file.
<script-full-path> >> <logfile-full-path> 2>&1
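When a script behaves differently under cron, it can help to record the environment it actually runs with. The following is a minimal sketch, not part of the original answer: the log path /tmp/cron_env_check.log and the LOG_FILE override are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: append the environment seen by this script to a log file.
# LOG_FILE and its default path are hypothetical; adjust as needed.
log_file="${LOG_FILE:-/tmp/cron_env_check.log}"

{
    echo "--- run at $(date) ---"
    echo "PATH=$PATH"
    # command -v reports where (or whether) yarn is found on PATH
    if command -v yarn >/dev/null 2>&1; then
        echo "yarn found at: $(command -v yarn)"
    else
        echo "yarn not found on PATH"
    fi
} >> "$log_file" 2>&1
```

Running this once from an interactive shell and once from cron, then comparing the two log entries, should show whether a missing PATH entry explains the empty status_live.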

# $? holds the exit status of the last command; for a pipeline, that is the last command in it (here grep).
# Run your complete command below: status_live is 0 if grep finds a match (0 means true in a shell if condition).
yarn application -list | grep -i "Streaming App"
status_live=$?
echo "status_live: ${status_live}"
if [ "$status_live" -eq 0 ]; then
echo "success"
else
echo "fail"
fi
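The exit-status check can also be written without an intermediate variable by testing grep directly in the if condition. This is a small sketch of that idiom; the function name report_status is made up for illustration, and it reads the application listing from standard input so that it can be tried without a cluster.

```shell
#!/bin/sh
# Reads an application listing on stdin and reports whether the app shows up.
# grep -q prints nothing; only its exit status is used by the if.
report_status() {
    if grep -qi -- "Streaming App"; then
        echo "Spark Streaming job is running, at: $(date)"
    else
        echo "Running spark streaming job again at: $(date)"
    fi
}
```

In the real script it would be fed by the command from the question, for example: yarn application -list | report_status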

Related

question on using bwait to wait for multiple bsub jobs to finish

I am new to using LSF (been using PBS/Torque all along).
I need to write code/logic to make sure all bsub jobs finish before other commands/jobs can be fired.
Here is what I have done: I have a master shell script which calls multiple other shell scripts via bsub commands. I capture the job ids from bsub in a log file and I need to ensure that all jobs get finished before the downstream shell script should execute its other commands.
Master shell script
#!/bin/bash
...Code not shown for brevity..
"Command 1 invoked with multiple bsubs" > log_cmd_1.txt
Need Code logic to use bwait before downstream Commands can be used
"Command 2 will be invoked with multiple bsubs" > log_cmd_2.txt
and so on
stdout captured from Command 1 within the Master Shell script is stored in log_cmd_1.txt which looks like this
Submitting Sample 101
Job <545> is submitted to .
Submitting Sample 102
Job <546> is submitted to .
Submitting Sample 103
Job <547> is submitted to .
Submitting Sample 104
Job <548> is submitted to .
I have used the codeblock shown below after Command 1 in the master shell script.
However, it does not seem to work for my situation. I suspect I have gotten the whole thing wrong.
while sleep 30m;
do
#the below gets the JobId from the log_cmd_1.txt and tries bwait
grep '^Job' <path_to>/log_cmd_1.txt | perl -pe 's!.*?<(\d+)>.*!$1!' | while read -r line; do res=$(bwait -w "done($line)");echo $res; done 1> <path_to>/running.txt;
# the below sed command deletes empty and whitespace-only lines
sed '/^\s*$/d' running.txt > running2.txt;
# -s file check operator means "file is not zero size"
if [ -s $WORK_DIR/logs/running2.txt ]
then
echo "Jobs still running";
else
echo "Jobs complete";
break;
fi
done
The question: What's the correct way to do this using bwait within the master shell script.
Thanks in advance.
bwait will block until the condition is satisfied, so the loops are probably not necessary. Note that since you're using done, if a job fails then bwait will exit and inform you that the condition can never be satisfied. Make sure to check for that case.
What you have should work. At least the following test worked for me.
#!/bin/bash
# "Command 1 invoked with multiple bsubs" > log_cmd_1.txt
( bsub sleep 0; bsub sleep 0 ) > log_cmd_1.txt
# Need Code logic to use bwait before downstream Commands can be used
while sleep 1
do
#the below gets the JobId from the log_cmd_1.txt and tries bwait
grep '^Job' log_cmd_1.txt | perl -pe 's!.*?<(\d+)>.*!$1!' | while read -r line; do res=$(bwait -w "done($line)");echo "$res"; done 1> running.txt;
# the below sed command deletes empty and whitespace-only lines
sed '/^\s*$/d' running.txt > running2.txt;
# -s file check operator means "file is not zero size"
if [ -s running2.txt ]
then
echo "Jobs still running";
else
echo "Jobs complete";
break;
fi
done
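Since bwait blocks until its condition is satisfied, the polling loop could in principle be replaced by a single call. The sketch below is one way to do that, under some assumptions: the helper names are invented for illustration, the log lines look like the ones in the question, and bwait is assumed to accept several done(...) terms joined with && in one -w expression.

```shell
#!/bin/bash
# Print the job IDs found in lines such as "Job <545> is submitted to ...".
extract_job_ids() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p' "$1"
}

# Read job IDs on stdin and print a condition like "done(545) && done(546)".
build_wait_condition() {
    cond=""
    while read -r id; do
        cond="${cond:+$cond && }done($id)"
    done
    printf '%s\n' "$cond"
}

# Block until every job in the log is done; a non-zero status from bwait
# (for example when a job failed) is passed back to the caller.
wait_for_jobs() {
    wait_cond=$(extract_job_ids "$1" | build_wait_condition)
    if [ -z "$wait_cond" ]; then
        echo "no job IDs found in $1" >&2
        return 1
    fi
    bwait -w "$wait_cond"
}
```

In the master script this would sit between Command 1 and Command 2, for example: wait_for_jobs log_cmd_1.txt || exit 1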
Another way to do it, which may be a little cleaner, is to use job arrays and job dependencies. A job array combines several pieces of work so that they can be managed as a single job. So your
"Command 1 invoked with multiple bsubs" > log_cmd_1.txt
could be submitted as a single job array. You'll need a driver script that can launch the individual jobs. Here's an example driver script.
$ cat runbatch1.sh
#!/bin/bash
# $LSB_JOBINDEX goes from 1 to 10
if [ "$LSB_JOBINDEX" -eq 1 ]; then
# do the work for job batch 1, job 1
...
elif [ "$LSB_JOBINDEX" -eq 2 ]; then
# etc
...
fi
Then you can submit the job array like this.
bsub -J 'batch1[1-10]' sh runbatch1.sh
This command will run 10 job array elements. LSF sets the variable LSB_JOBINDEX in the driver script's environment to tell you which element the driver is running. Since the array has a name, batch1, it's easier to manage. You can submit a second job array that won't start until all elements of the first have completed successfully. The second array is submitted with this command.
bsub -w 'done(batch1)' -J 'batch2[1-10]' sh runbatch2.sh
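When the array elements differ only in which sample they process, the driver can compute the sample from the index instead of using a chain of if/elif branches. This is only a sketch: the offset of 100 mirrors the sample numbers 101 to 104 in the question's log, and process_sample is a hypothetical placeholder for the real work.

```shell
#!/bin/bash
# Map an array index (1, 2, ...) to a sample ID (101, 102, ...).
# The offset 100 is an assumption based on the sample numbers in the question.
sample_for_index() {
    echo $((100 + $1))
}

# Do the work for the array element identified by LSB_JOBINDEX.
run_element() {
    sample=$(sample_for_index "$LSB_JOBINDEX")
    echo "Processing sample $sample"
    # process_sample "$sample"   # hypothetical command doing the real work
}

# Only run when LSF has set the index, i.e. inside a job array element.
if [ -n "${LSB_JOBINDEX:-}" ]; then
    run_element
fi
```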
I hope that this helps.

Output redirection to console in shell script, not reflecting realtime

I have encountered a weird problem with console output when calling a subscript from inside another script.
Below is the Main Script which is calling a TestScript.
The TestScript is an installation script written in perl which takes some time to execute and prints messages as the installation progresses.
My problem here is that the output from the called perl script is only shown on the console once the installation is completed and the script returns.
Oddly, I have used this kind of syntax successfully before for calling shell scripts, and for them the output is shown as it is produced, without waiting for the subscript to return.
I need to capture the output of the script so that I can grep it to check whether the installation was successful.
I do not control the perl script and cannot modify it in any way.
Any help would be greatly appreciated.
Thanks in advance.
#!/bin/sh
echo " Main script"
output=`/var/tmp/Packages/TestScript.pl | tee /dev/tty`
exitCode=$?
echo $output | grep -q "Installation completed successfully"
if [ $? -eq 0 ]; then
echo "Installation was successful"
fi
echo $exitCode

How to read a line from continuously written log file

I am new to shell scripting, so if this post is redundant please redirect me to the existing post.
I have a java -jar command that runs in the background and keeps updating a log file.
I am automating this process with a shell script, using the code below. Please help me figure out what I am missing.
java -jar fileName.jar &
pid=$!
echo $!
cd /usr/ebp/logs/
logs=$(ls -t | head -n 1) #trying to read latest log file from directory
tail -f ${logs} | while read LOGLINE
do
if [ "${LOGLINE}" == *"Batch process completed successfully"* ]
then
echo ${LOGLINE}
echo "csv files created successfully"
break
fi
done
kill -9 ${pid}
Once the batch process has completed, I want to kill the jar process by its process ID. But the script never recognizes the completion line in the log file and stays stuck in the while loop.
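One thing to note in the code above is that inside [ ... ] the == operator compares strings literally and does no wildcard matching, so the test never succeeds. As a hedged alternative to the tail -f loop (a different technique, not the asker's), the log can simply be polled with grep until the message appears. The function name, the one-second interval, and the default of 60 tries are assumptions made for this sketch.

```shell
#!/bin/sh
# Poll a log file until it contains the given message.
# Returns 0 when the message is found, 1 after max_tries unsuccessful polls.
wait_for_message() {
    logfile=$1
    message=$2
    max_tries=${3:-60}   # hypothetical default: give up after about a minute
    tries=0
    # grep -F treats the message as a fixed string, -q suppresses output
    until grep -qF -- "$message" "$logfile" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}
```

With the variables from the question, a call could look like: wait_for_message "$logs" "Batch process completed successfully" && kill "$pid"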

To check whether nohup service is running or not using shell script?

I created a background service with nohup using the command below in PuTTY.
nohup php /var/www/html/XYZ/sample.php &
This command executes the sample.php file in the background.
Now I need a shell script that checks whether this service is running. If the service is not running, I want the shell script to start it on its own. Below is the code I tried.
#!/bin/bash
email_to="xyz#gmail.com";
export DISPLAY=:0.0
PIDS=`ps -aux | grep sample.php|awk '{print $2}'`
if [ -z "$PIDS" ]; then
echo "$(date) - The service is not running. Sending email to :$email_to" >> /var/www/html/XYZ/sample.php;
echo "SERVICE is not running - $(date)" | mail -s "service is not running - $(date)" $email_to
echo "" >> /var/www/html/XYZ/sample.php;
exit 1
else
echo "$(date) - Service already running. Sending email to : $email_to" >> /var/www/html/XYZ/sample.php;
echo "SERVICE is running - $(date)" | mail -s "SERVICE is running - $(date)" $email_to
fi
When I execute the file, I get the mail saying the service is running. After I kill sample.php and execute the file again, I get the same "service is running" mail, which is wrong. Can anyone tell me where I have gone wrong?
With ps -aux | grep sample.php, grep also finds its own process, because its command line (grep sample.php) contains sample.php and is listed by ps as well. You can avoid this by changing the grep pattern so that the command line no longer contains sample.php literally, e.g. grep 'sample\.php' (which, by the way, also removes the risk of the . matching some other character). You'll probably also need wide output from ps so that the command is not truncated, so change the pipeline to ps waux | grep 'sample\.php'.
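The suggestion above can be sketched as two small functions. The function names are invented for illustration, and the restart command is the one from the question; service_pids reads ps output from standard input so the filter can be tried on sample text.

```shell
#!/bin/sh
# Read ps output on stdin and print the PID column of lines running sample.php.
# The pattern 'sample\.php' does not match grep's own command line, because
# that command line contains the backslash.
service_pids() {
    grep 'sample\.php' | awk '{print $2}'
}

# Start the service with the command from the question when no PID is found.
ensure_service() {
    if [ -z "$(ps waux | service_pids)" ]; then
        nohup php /var/www/html/XYZ/sample.php >/dev/null 2>&1 &
    fi
}
```

Calling ensure_service from a cron job would then start the service only when it is not already running.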

IBM i OS/400 QSH - shell script

I've got a start/stop script provided for Tivoli Workload Scheduler that starts and stops the TWS services on IBM i.
# CHECK ROOT USER
WHO=`id | cut -f1 -d" "`
if [ "$WHO" = "uid=0(root)" ]
then
su TWSSVC -c "/etc/rc.d/init.d/tebctl-tws_cpa_agent_TWSSVC stop"
exit $?
fi
/etc/rc.d/init.d/tebctl-tws_cpa_agent_TWSSVC stop
exit $?
The problem with this is that on OS/400 the equivalent of root is QSECOFR, so I've amended the line
if [ "$WHO" = "uid=0(root)" ]
to
if [ "$WHO" = "uid=0(QSECOFR)" ]
Then I got an error on the following line:
su TWSSVC -c "/etc/rc.d/init.d/tebctl-tws_cpa_agent_TWSSVC stop"
/TWSSVC/TWS/ShutDownLwa: 001-0019 Error found searching for command su. No such path or directory.
How do I change the script so that, when the user is QSECOFR, it will su to TWSSVC and trigger the start/stop script? I'm not very familiar with OS/400. I'm triggering this script in the qsh environment.
You can try the following, assuming sudo is available in your environment (sudo takes the target user with -u):
sudo -u TWSSVC /etc/rc.d/init.d/tebctl-tws_cpa_agent_TWSSVC stop
Tell me how it goes.
