Troubleshooting PM2 Restart in Cron - bash

Update: I was not able to get this working as I desired. I ended up using the pm2 --cron flag since cron couldn't find my pm2 processes.
I am using pm2 restart in a bash script. It always fails on the pm2 restart projectName commands when run by cron (as root), but works when run manually with sudo. I am not sure how to troubleshoot this, as the pm2 log files don't show anything obvious. It looks like someone else has had the same issue, so perhaps it's just a bug? Has anybody else found a way to resolve this?
Thank you!
Edit: Adding some context for @kevinnls
Here's the .sh script; I've isolated the command so I can test cron
#!/bin/bash
# Vars
NOW=$(date +"%Y-%m-%d")
LOGS_PATH="/scriptLoc/logs"
LOG_FILE="$LOGS_PATH/$NOW-log.txt"
BE_PATH="/beLoc"
# Start log file
date >> "$LOG_FILE"
echo "Running..." >> "$LOG_FILE"
# Temp: isolated pm2 command so I can test cron
cd "$BE_PATH"
pm2 restart be >> "$LOG_FILE"
pm2 restart be || echo "ERROR: BE pm2 restart: $?" >> "$LOG_FILE"
echo "Done." >> "$LOG_FILE"
exit
If I run the script with sudo ./script.sh, it works.
If I run it with cron I see the following output in my log file:
Fri Mar 26 17:35:01 UTC 2021
Running...
Use --update-env to update environment variables
ERROR: BE pm2 restart: 1
Done.
If I view pm2 logs:
I see it exit with code 0 and restart when I run the script manually.
I see no output for the failed restart from cron.

The error you got:
[PM2][ERROR] Process or Namespace be not found
What is "be"? seems like pm2 from inside of cron has no idea what to make of it.
If it is a file, when you ran the script in your shell, was the file be in your working directory?
You can make these changes to log your error messages to $LOG_FILE.
This should help you get more details.
# redirect output *and* errors to the log
pm2 restart be &>> "$LOG_FILE"
exit_code=$?
# if the previous exit code != 0, run everything after &&
[[ $exit_code -ne 0 ]] && echo "ERROR: BE pm2 restart: $exit_code" >> "$LOG_FILE"
Then try running your script again and share the errors encountered.
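For what it's worth, a common cause here (an assumption on my part; your logs don't confirm it) is that cron's minimal environment points pm2 at a different daemon than the one that owns your processes, so the process list comes up empty. Pinning PATH and PM2_HOME at the top of the script is one way to test that theory:
# Hypothetical values; adjust to the user that actually owns the pm2 daemon
export PATH=/usr/local/bin:/usr/bin:/bin
export PM2_HOME=/home/youruser/.pm2
pm2 ls &>> "$LOG_FILE"   # should list "be" if this theory holds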

I was not able to resolve this in the manner I desired. Ultimately I used the --cron (-c) flag of pm2 itself.
pm2 CLI reference
GitLab issue
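For anyone landing here: the flag takes a cron pattern at start time. A minimal sketch (app.js, the process name, and the schedule are placeholders, not my real setup):
# restart the process every day at 03:00, managed by pm2 itself
pm2 start app.js --name be --cron "0 3 * * *"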

Related

Bash script runs but no output on main commands and not executed

I'm setting a cron job that is a bash script containing the below:
#!/bin/bash
NUM_CONTAINERS=$(docker ps -q | wc -l)
if [ "$NUM_CONTAINERS" -lt 40 ]
then
  echo "Time: $(date). Restart containers."
  cd /opt
  pwd
  sudo docker kill $(docker ps -q)
  docker-compose up -d
  echo "Completed."
else
  echo "Nothing to do"
fi
The output is appended to a log file:
>> cron.log
However the output in the cron file only shows:
Time: Sun Aug 15 10:50:01 UTC 2021. Restart containers.
/opt
Completed.
Both commands do not seem to execute, as I don't see any change in my containers either.
These two non-working commands work fine in a standalone .sh script without the condition, though.
What am I doing wrong?
The user running the cron has sudo privileges, and we can see the second echo printing.
Lots of times, things that work outside of cron don't work within cron because the environment is not set up in the same way.
You should generally capture standard output and error, to see if something is going wrong.
For example, use >> cron.log 2>&1 in your crontab file, this will capture both.
There's at least the possibility that docker is not in your path or, even if it is, the docker commands are not working for some other reason (that you're not seeing since you only capture standard output).
Capturing standard error should help out with that, if it is indeed the issue.
As an aside, I tend to use full path names inside cron scripts, or set up very limited environments at the start to ensure everything works correctly (once I've established why it's not working correctly).
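For example, a sketch of the crontab entry (paths, script name, and schedule are assumptions of mine; check which docker and which docker-compose on your host):
# give cron an explicit PATH and capture stdout *and* stderr
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
*/5 * * * * /opt/check-containers.sh >> /opt/cron.log 2>&1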

Detect if docker ran successfully within the same script

My script.sh:
#!/bin/sh
docker run --name foo
(Just assume that the docker command works and the container name is foo. Can't make the actual command public.)
I have a script that runs a docker container. I want to check that it ran successfully and echo the successful running status on the terminal.
How can I accomplish this using the container name? I know that I have to use something like docker inspect, but when I try to add that command, it only gets executed after I ^C my script, probably because the foreground docker run still holds the terminal.
In this answer, docker is executed from some other script, so it doesn't really work for my use case.
The linked answer from Jules Olléon works for permanently running services like webservers, application servers, databases and similar software. In your example, it seems that you want to run a container on demand, one that is designed to do some work and then exit. Here, the status doesn't help.
When running the container in foreground mode, as your example shows, docker forwards the application's return code to the calling shell. Since you didn't post any code, here is a simple example: we create an rc.sh script returning 1 as its exit code (which normally indicates a failure):
#!/bin/sh
echo "Testscript failed, returning exitcode 1"
exit 1
It gets copied into and executed by this Dockerfile:
FROM alpine:3.7
COPY rc.sh .
ENTRYPOINT [ "sh", "rc.sh" ]
Now we build this image using docker build -t rc-test . and run a short-lived container:
$ docker run --rm rc-test
Testscript failed, returning exitcode 1
Bash gives us the return code in $?:
$ echo $?
1
So we see that the container failed, and we can check for this inside a bash script with an if-condition to perform some action on failure:
#!/bin/bash
if ! docker run --rm rc-test; then
echo "Docker container failed with rc $?"
fi
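If you would rather go through the container name, as your mention of docker inspect suggests, the exit code is also recorded in the container's state. A sketch, assuming the container is named foo and was not started with --rm (otherwise it is removed before you can inspect it):
# read the recorded exit code from the stopped container's state
rc=$(docker inspect -f '{{.State.ExitCode}}' foo)
if [ "$rc" -ne 0 ]; then
  echo "Container foo failed with rc $rc"
fi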
After running your docker run command, you can check in this way whether your docker container is still up and running:
s='^foo$'
status=$(docker ps -qf "name=$s" --format='{{.Status}}')
[[ -n $status ]] && echo "Running: $status" || echo "not running"
You just need to run it with -d to execute the container in detached mode. With this, the solutions provided in the other post and the solution provided by @anubhava are both good options.
docker run -d --name some_name mycontainer
s='^some_name$'
status=$(docker ps -qf "name=$s" --format='{{.Status}}')
[[ -n $status ]] && echo "Running: $status" || echo "not running"

bash -x /var/lib/cloud/instance/user-data.txt runs but user-data from terraform gives error

I am trying to run a script using Terraform. The content of the user-data is as follows:
.
.
cat <<EOH | java -jar ./jenkins-cli.jar -s $JENKINS_URL -auth admin:$PASSWORD create-credentials-by-xml system::system::jenkins _
<com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey plugin="ssh-credentials#1.16">
<scope>GLOBAL</scope>
<id>$CRED_ID</id>
<description>$SLAVE_IP pem file</description>
<username>ec2-user</username>
<privateKeySource class="com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey\$DirectEntryPrivateKeySource">
<privateKey>${worker_pem}</privateKey>
</privateKeySource>
</com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey>
EOH
.
.
When it executes as part of user-data, it fails with the error: No such command create-credentials-by-xml.
But when I log in to the instance and execute bash -x /var/lib/cloud/instance/user-data.txt it runs as expected.
Can anyone please tell what is the reason for it and how to fix it? Thanks!
I have tried #cloudhook and separating the lines as well, but that didn't work.
Answering my own question:
Where the problem was:
The issue was not with bash but with jenkins-cli.jar itself.
The error message No such command create-credentials-by-xml made me think it was a bash error, but in reality it was a jar-file error all along.
Reason:
It failed during user-data execution but not during a manual bash run because Jenkins was unable to load its plugin configuration in that short amount of time.
Solution:
Given that reason, it clearly needed more time, so I added a sleep 25 to confirm whether that works, and it does, but it was not an ideal solution.
Optimized Solution:
To make it better, I list the plugins before executing any jar commands, and if the list comes back empty, re-run the command:
# Create a CMD utility for jenkins-cli commands
jenkins_cmd="java -jar /opt/jenkins-cli.jar -s $JENKINS_URL -auth admin:$PASSWORD"
# Wait for Jenkins to load all plugins
while true; do
  count=$($jenkins_cmd list-plugins 2>/dev/null | wc -l)
  ret=$?
  echo "count [$count] ret [$ret]"
  if (( count > 0 )); then
    break
  fi
  sleep 30
done
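One caveat: if Jenkins never comes up, this loop spins forever. A bounded variant (the 10-attempt cap is an arbitrary choice of mine, not something from the original setup):
# give up after 10 attempts (~5 minutes)
for attempt in $(seq 1 10); do
  count=$($jenkins_cmd list-plugins 2>/dev/null | wc -l)
  (( count > 0 )) && break
  echo "attempt $attempt: Jenkins not ready yet"
  sleep 30
done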

executing command on vagrant-mounted

I'm trying to run a command after a share is mounted with vagrant, but I've never written an upstart script before. What I have so far is:
start on vagrant-mounted
script
if [ "$MOUNTPOINT" = "/vagrant" ]
then
env CMD="echo $MOUNTPOINT mounted at $(date)"
elif [ "$MOUNTPOINT" = "/srv/website" ]
then
env CMD ="echo execute command"
fi
end script
exec "$CMD >> /var/log/vup.log"
Of course, that's not the actual script I want to run (I haven't gotten that far yet), but the structure is what I need. My starting point has been this article. I had a different version that was simply:
echo $MOUNTPOINT mounted at `date`>> /var/log/vup.log
that version did write to the log.
Trying to use init-checkconf fails with: failed to ask Upstart to check conf file.
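For reference, a sketch of how this could be restructured, assuming Upstart's env is a static configuration stanza (not settable at runtime) and that a job cannot combine script and exec: keep everything, including the redirection, inside the script block.
start on vagrant-mounted
script
  # $MOUNTPOINT is set by the vagrant-mounted event
  if [ "$MOUNTPOINT" = "/vagrant" ]; then
    echo "$MOUNTPOINT mounted at $(date)" >> /var/log/vup.log
  elif [ "$MOUNTPOINT" = "/srv/website" ]; then
    echo "execute command" >> /var/log/vup.log
  fi
end script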

Script and Scriptreplay give me some problems

I want to replicate what I do on my console.
I'm using this command:
roberto@rcisla-pc:~/Desktop$ script -t 2> timing.log -a output.session
and then I execute some commands, but at the moment of replaying this occurs:
roberto@rcisla-pc:~/Desktop$ scriptreplay time.log record.session
scriptreplay: unexpected end of file on record.session
And record.session is empty. I don't know what's wrong!
I'm on Ubuntu 13.04, thanks!
Well, I just missed the exit command.
This is the complete sequence:
1- To start the recording:
roberto@rcisla-pc:~/Desktop$ script -t 2> timing.log -a output.session
2- To stop the recording, execute exit:
roberto@rcisla-pc:~/Desktop$ exit
3- Then execute the next command to see the complete recording:
roberto@rcisla-pc:~/Desktop$ scriptreplay timing.log output.session
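Putting it together, the full cycle is:
script -t 2> timing.log -a output.session   # 1. start recording
# ... run your commands ...
exit                                        # 2. stop recording
scriptreplay timing.log output.session      # 3. replay the session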
