I am running a weather forecast model and I want to automate the model runs via crontab. When I submit my job interactively through:
qsub -I -l nodes=8:ppn=8 -l walltime=2:00:00
The job runs ok.
When I submit through crontab, the job disappears without a trace: no error file anywhere and no output file, only a mail that reads:
From: root@master.cluster (Cron Daemon)
To: test@master.cluster
Subject: Cron <test@master> PATH=/opt/torque/bin:/usr/bin:/bin:. qsub /home/test/WRF/SCRIPTS/wrf_00_run.sh
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/home/test>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=test>
X-Cron-Env: <USER=test>
Message-Id: <20141220080001.D0F6B5C08ED@master.cluster>
Date: Sat, 20 Dec 2014 10:00:01 +0200 (CAT)
152.master.cluster
Here is my crontab entry
50 09 * * * PATH=/opt/torque/bin:/usr/bin:/bin:. qsub /home/test/WRF/SCRIPTS/wrf_submit_00.sh
and wrf_submit_00.sh has two dependent jobs to run consecutively
#!/bin/bash
cd WRF/WRFV3/run
FIRST =`qsub wrf_initialise.sh`
echo $FIRST
SECOND = `qsub -w depend=afterok:$FIRST wrf_00_run.sh`
echo $SECOND
exit 0
and the first job is
#!/bin/bash
#PBS -l nodes=8:ppn=8
cd WRF/WRFV3/run
echo -n "this script is running on: "
hostname -f
date
echo ""
echo "### PBS_NODEFILE (${PBS_NODEFILE}) ###"
cat ${PBS_NODEFILE}
echo ""
mpirun ./real.exe
exit 0
When I submit the two jobs separately through crontab, they disappear in the same way.
Please help, I am stuck. A Google search for similar problems gives me no clue.
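Independent of cron, a few details in the submit script would stop the chain from working: bash assignments take no spaces around `=`, Torque expects the job dependency via `-W depend=...` (capital W), and a relative `cd` depends on where the script happens to start. Note also that the mail body contains a job id (152.master.cluster), so the submission itself went through. A minimal sketch of the chaining step, assuming the run directory is passed in explicitly and that the script is started directly by cron rather than being handed to qsub itself (`submit_chain` is a hypothetical helper name, not part of Torque):

```bash
#!/bin/bash
# Hypothetical helper: submit two Torque jobs so that the second one
# starts only if the first one finishes successfully.
submit_chain() {
    local run_dir=$1 first second
    cd "$run_dir" || return 1

    # No spaces around '=' in an assignment; qsub prints the new job id on stdout.
    first=$(qsub wrf_initialise.sh) || return 1
    echo "$first"

    # Torque spells the dependency option with a capital W.
    second=$(qsub -W depend=afterok:"$first" wrf_00_run.sh) || return 1
    echo "$second"
}
```

It could then be called as `submit_chain "$HOME/WRF/WRFV3/run"`, where the absolute path is an assumption based on the HOME value shown in the mail.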
When I run a script from crontab on Mac OS X like this:
*/1 * * * * /Users/dolphin/Library/"Mobile Documents"/com~apple~CloudDocs/Document/source/dolphin/dolphin-scripts/bash/cron/latex_compile_alive_monitor.sh >> /Users/dolphin/shell.log
the output is:
From dolphin@dolphins-MacBook-Pro.local Tue Jan 21 20:47:01 2020
Return-Path: <dolphin@dolphins-MacBook-Pro.local>
X-Original-To: dolphin
Delivered-To: dolphin@dolphins-MacBook-Pro.local
Received: by dolphins-MacBook-Pro.local (Postfix, from userid 501)
id 7430417C4C19; Tue, 21 Jan 2020 20:47:00 +0800 (CST)
From: dolphin@dolphins-MacBook-Pro.local (Cron Daemon)
To: dolphin@dolphins-MacBook-Pro.local
Subject: Cron <dolphin@dolphins-MacBook-Pro> /Users/dolphin/Library/"Mobile Documents"/com~apple~CloudDocs/Document/source/dolphin/dolphin-scripts/bash/cron/latex_compile_alive_monitor.sh >> /Users/dolphin/shell.log
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=dolphin>
X-Cron-Env: <USER=dolphin>
X-Cron-Env: <HOME=/Users/dolphin>
Message-Id: <20200121124701.7430417C4C19@dolphins-MacBook-Pro.local>
Date: Tue, 21 Jan 2020 20:47:00 +0800 (CST)
+ BOOK_PATH='/Users/dolphin/Library/Mobile Documents/com~apple~CloudDocs/Document/source/dolphin/summary/'
+ COMMAND='/Library/TeX/texbin/latexmk -pdfxe -pvc -xelatex -interaction=nonstopmode '
+ PROCESS_NAME_KEYWORDS_MAP=(["dolphin-book-2020.tex"]="${COMMAND} ./dolphin-book-2020/dolphin-book-2020.tex" ["the-book-of-mine.tex"]="${COMMAND} ./the-books-of-mine/the-book-of-mine.tex" ["kubelet-learn.tex"]="${COMMAND} ./kubelet-learn/kubelet-learn.tex")
/Users/dolphin/Library/Mobile Documents/com~apple~CloudDocs/Document/source/dolphin/dolphin-scripts/bash/cron/latex_compile_alive_monitor.sh: line 15: dolphin: unbound variable
Which variable is unbound? COMMAND? It is already defined in the script. This is my script:
#!/usr/bin/env bash
# Exit automatically when an uninitialized variable is used
set -u
# Exit the script automatically when any command fails
set -e
# Print each command before it is executed
set -x
BOOK_PATH="/Users/dolphin/Library/Mobile Documents/com~apple~CloudDocs/Document/source/dolphin/summary/"
COMMAND="/Library/TeX/texbin/latexmk -pdfxe -pvc -xelatex -interaction=nonstopmode "
declare -A PROCESS_NAME_KEYWORDS_MAP=(
["dolphin-book-2020.tex"]="${COMMAND} ./dolphin-book-2020/dolphin-book-2020.tex"
["the-book-of-mine.tex"]="${COMMAND} ./the-books-of-mine/the-book-of-mine.tex"
["kubelet-learn.tex"]="${COMMAND} ./kubelet-learn/kubelet-learn.tex"
)
cd "${BOOK_PATH}"
for key in ${!PROCESS_NAME_KEYWORDS_MAP[@]}
do
PID_COUNT=`ps -ef | grep "${key}" | grep -v "grep" | wc -l`
if [[ ${PID_COUNT} -lt 1 ]]; then
nohup `${PROCESS_NAME_KEYWORDS_MAP[${key}]}` &
else
echo "process already exists..."
fi
done
Maybe remove the trailing space at the end of COMMAND.
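The trace points at something other than COMMAND. One likely explanation, offered as an assumption rather than a confirmed diagnosis: with cron's `PATH=/usr/bin:/bin`, `env bash` resolves to the stock macOS `/bin/bash` (version 3.2), which has no associative arrays. The subscripts are then evaluated arithmetically, so `dolphin-book-2020.tex` is read as `dolphin` minus `book` minus a number, and `set -u` reports `dolphin` as unbound. A small guard, sketched here with a hypothetical function name, would make that failure explicit:

```bash
#!/usr/bin/env bash
# Hypothetical guard: associative arrays (declare -A) need bash 4.0 or newer.
require_assoc_arrays() {
    if (( BASH_VERSINFO[0] < 4 )); then
        echo "bash ${BASH_VERSION} has no associative arrays" >&2
        return 1
    fi
}

# Only touch the associative array when the running bash supports it.
if require_assoc_arrays; then
    declare -A demo=(["dolphin-book-2020.tex"]="ok")
    echo "${demo[dolphin-book-2020.tex]}"
fi
```

If the guard fails under cron, a newer bash would have to be named explicitly in the shebang line or made reachable through a PATH setting in the crontab; where such a bash lives on a given machine is not something the question tells us.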
I got the following error from executing a line in my Vixie Cron file:
/bin/sh: 1: cd: can't cd to $HOME/prg/gridtrader
To me this means that $HOME is not being expanded, even though crontab(5) says it will be and the execution log shows HOME as defined.
The cron file looks like so:
GT_DIR = $HOME/prg/gridtrader
PYTHON_EXE = $HOME/install/python37/bin/python3
APPLICATION = gridtrader/gridtrader.py
ACCOUNT = metaperl_yobit
# GRID TRADER
*/5 * * * * cd $GT_DIR && $PYTHON_EXE $APPLICATION --monitor $ACCOUNT
The log message I got is:
X-Cron-Env: <ST=/home/schemelab/prg/surgetrader/src>
X-Cron-Env: <INVOKE=/home/schemelab/install/miniconda3/bin/invoke>
X-Cron-Env: <I=iniset_1>
X-Cron-Env: <AGT=~/prg/adsactly-gridtrader>
X-Cron-Env: <GT_DIR=$HOME/prg/gridtrader>
X-Cron-Env: <PYTHON_EXE=$HOME/install/python37/bin/python3>
X-Cron-Env: <APPLICATION=gridtrader/gridtrader.py>
X-Cron-Env: <ACCOUNT=metaperl_yobit>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/home/schemelab>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=schemelab>
Date: Tue, 26 Feb 2019 11:25:01 -0500 (EST)
/bin/sh: 1: cd: can't cd to $HOME/prg/gridtrader
As you can see, from the shell there is no problem executing this cd command:
schemelab@metta:~$ ls /home/schemelab/prg/gridtrader
LICENSE SPEC.md accounts doc log requirements.txt system.ini tmp venv
README.md StatusReport.docx batch gridtrader persistence src system.ini.sample tradelog.docx
schemelab@metta:~$ cd /home/schemelab/prg/gridtrader
schemelab@metta:~/prg/gridtrader$ ls -ld ~/prg/gridtrader
drwxr-xr-x 13 schemelab schemelab 4096 Feb 18 13:56 /home/schemelab/prg/gridtrader
schemelab@metta:~/prg/gridtrader$
You have run into one of cron's most common problems: treating a crontab like an arbitrary shell script. A crontab is not a shell script. You can set variables in it, but the value on the right-hand side is stored literally, so a reference such as $HOME inside GT_DIR is never expanded.
I suggest you substitute the values directly into the cron line and avoid the extra variables:
# GRID TRADER
*/5 * * * * cd $HOME/prg/gridtrader && $HOME/install/python37/bin/python3 gridtrader/gridtrader.py --monitor metaperl_yobit
Another approach to the problem of unexpanded cron variable content is eval:
*/5 * * * * eval cd $GT_DIR && eval $PYTHON_EXE $APPLICATION --monitor $ACCOUNT
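The difference between the literal value that cron stores and the expanded path can be reproduced in any POSIX shell. A minimal sketch, assuming `GT_DIR` holds exactly the unexpanded text shown in the `X-Cron-Env` header (the names `plain` and `expanded` are introduced only for this illustration):

```sh
#!/bin/sh
# cron stores the right-hand side verbatim, as if it had been single-quoted.
GT_DIR='$HOME/prg/gridtrader'

# A plain expansion yields the literal text; the shell does not expand it again.
plain=$GT_DIR

# eval re-parses the text, so $HOME is expanded on the second pass.
eval "expanded=$GT_DIR"

echo "plain:    $plain"
echo "expanded: $expanded"
```

Because eval re-parses the whole string as shell code, it is only reasonable when the crontab values are fully under your control.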
I want to add
0 5 1 * * goaccess -f /var/log/nginx/access.log -a > /home/xan/reports/report-week-$(date +%Y.%m.%d).html
but crontab always complains about that:
Subject: Cron <root@deimos> goaccess -f /var/log/nginx/access.log -a > /home/xan/reports/report-week-$(date +
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
Message-Id: <E1bIogT-0001FX-9n@deimos>
Date: Fri, 01 Jul 2016 05:00:01 +0200
/bin/sh: 1: Syntax error: end of file unexpected (expecting ")")
What is the proper syntax to do that?
% has special meaning in a crontab (it represents a newline), so you need to escape it to specify a literal percent sign.
0 5 1 * * goaccess ... > /home/xan/reports/report-week-$(date +\%Y.\%m.\%d).html
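The truncated Subject: line in the mail above shows the effect: everything from the first unescaped `%` onward never reaches /bin/sh as part of the command. A rough model of that splitting step, written as a hypothetical bash function (`cron_shell_command` is not part of cron; real cron additionally feeds the remainder of the line to the command as standard input):

```bash
#!/bin/bash
# Simplified model of how cron derives the shell command from a crontab
# command field: '\%' becomes a literal '%', and the first unescaped '%'
# ends the command.
cron_shell_command() {
    local line=$1 out='' i ch
    for (( i = 0; i < ${#line}; i++ )); do
        ch=${line:i:1}
        if [[ $ch == '\' && ${line:i+1:1} == '%' ]]; then
            out+='%'          # escaped percent sign: keep it, drop the backslash
            i=$(( i + 1 ))
        elif [[ $ch == '%' ]]; then
            break             # unescaped percent sign: the command ends here
        else
            out+=$ch
        fi
    done
    printf '%s\n' "$out"
}

cron_shell_command 'report-week-$(date +%Y.%m.%d).html'     # prints: report-week-$(date +
cron_shell_command 'report-week-$(date +\%Y.\%m.\%d).html'  # prints: report-week-$(date +%Y.%m.%d).html
```

The first call reproduces the unterminated `$(` that /bin/sh complained about; the second shows the command arriving intact once each `%` is escaped.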
In the command below I open /dev/tcp/10.10.10.1/80 for both reading and writing and associate it with file descriptor 3:
$ time exec 3<>/dev/tcp/10.10.10.1/80
bash: connect: Operation timed out
bash: /dev/tcp/10.10.10.1/80: Operation timed out
real 1m15.151s
user 0m0.000s
sys 0m0.000s
This automatically tries to perform a TCP three-way handshake. If 10.10.10.1 is not reachable, as in the example above, the connect system call keeps trying for 75 seconds. Is this 75-second timeout determined by bash, or is it a system default? Last but not least, is there a way to decrease this timeout value?
As already mentioned, it's not possible in Bash without modifying the source, but a workaround is to use the timeout command, e.g.:
$ timeout 1 bash -c "</dev/tcp/stackoverflow.com/80" && echo Port open. || echo Port closed.
Port open.
$ timeout 1 bash -c "</dev/tcp/stackoverflow.com/81" && echo Port open. || echo Port closed.
Port closed.
Using this syntax, the timeout command will kill the process after the given time.
See: timeout --help for more options.
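The same idea can be wrapped in a reusable function. This is a sketch only: `port_open` is a hypothetical helper name, and it assumes GNU `timeout` is installed and that bash was built with /dev/tcp support.

```bash
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within the given number of seconds (default: 1).
port_open() {
    local host=$1 port=$2 seconds=${3:-1}
    # The connect attempt runs in a child bash so that timeout(1) can kill it.
    # Inside the child, $1 is the host and $2 is the port.
    timeout "$seconds" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage (not executed here):
#   if port_open 10.10.10.1 80 2; then echo "Port open."; else echo "Port closed or unreachable."; fi
```

The function returns non-zero both when the connection is refused and when the time limit expires, so the caller does not distinguish a closed port from an unreachable host.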
It is determined by TCP. It can be decreased on a per-socket basis by application code.
NB The timeout only takes effect if there is no response at all. If there is a connection refusal, the error occurs immediately.
No: there is no way of changing timeout by using /dev/tcp/
Yes, you could change default timeout for TCP connection in any programming language.
But, bash is not a programming language!
You can look at the source code (see: Bash Homepage); in lib/sh/netopen.c, the _netopen4 function contains:
s = socket(AF_INET, (typ == 't') ? SOCK_STREAM : SOCK_DGRAM, 0);
If you read this file carefully, you will see that a connection timeout is not handled anywhere.
Without patching the bash sources, there is no way to change the connection timeout from a bash script.
Simple HTTP client using netcat (near pure bash)
Here is a small sample HTTP client written in bash, with netcat handling the connection:
#!/bin/bash
tmpfile=$(mktemp -p $HOME .netbash-XXXXXX)
exec 7> >(nc -w 3 -q 0 stackoverflow.com 80 >$tmpfile)
exec 6<$tmpfile
rm $tmpfile
printf >&7 "GET %s HTTP/1.0\r\nHost: stackoverflow.com\r\n\r\n" \
/questions/24317341/how-to-decrease-tcp-connect-system-call-timeout
timeout=100;
while ! read -t .001 -u 6 status ; do read -t .001 foo;done
echo STATUS: $status
[ "$status" ] && [ -z "${status//HTTP*200 OK*}" ] || exit 1
echo HEADER:
while read -u 6 -a head && [ "${head//$'\r'}" ]; do
printf "%-20s : %s\n" ${head%:} "${head[*]:1}"
done
echo TITLE:
sed '/<title>/s/<[^>]*>//gp;d' <&6
exec 7>&-
exec 6<&-
This could render:
STATUS: HTTP/1.1 200 OK
HEADER:
Cache-Control : private
Content-Type : text/html; charset=utf-8
X-Frame-Options : SAMEORIGIN
X-Request-Guid : 46d55dc9-f7fe-425f-a560-fc49d885a5e5
Content-Length : 91642
Accept-Ranges : bytes
Date : Wed, 19 Oct 2016 13:24:35 GMT
Via : 1.1 varnish
Age : 0
Connection : close
X-Served-By : cache-fra1243-FRA
X-Cache : MISS
X-Cache-Hits : 0
X-Timer : S1476883475.343528,VS0,VE100
X-DNS-Prefetch-Control : off
Set-Cookie : prov=ff1129e3-7de5-9375-58ee-5f739eb73449; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
TITLE:
bash - How to decrease TCP connect() system call timeout? - Stack Overflow
Some explanations:
We first create a temporary file (under a private directory for security reasons), bind it to file descriptors, and delete it before using it.
$ tmpfile=$(mktemp -p $HOME .netbash-XXXXXX)
$ exec 7> >(nc -w 3 -q 0 stackoverflow.com 80 >$tmpfile)
$ exec 6<$tmpfile
$ rm $tmpfile
$ ls $tmpfile
ls: cannot access /home/user/.netbash-rKvpZW: No such file or directory
$ ls -l /proc/self/fd
lrwx------ 1 user user 64 Oct 19 15:20 0 -> /dev/pts/1
lrwx------ 1 user user 64 Oct 19 15:20 1 -> /dev/pts/1
lrwx------ 1 user user 64 Oct 19 15:20 2 -> /dev/pts/1
lr-x------ 1 user user 64 Oct 19 15:20 3 -> /proc/30237/fd
lr-x------ 1 user user 64 Oct 19 15:20 6 -> /home/user/.netbash-rKvpZW (deleted)
l-wx------ 1 user user 64 Oct 19 15:20 7 -> pipe:[2097453]
$ echo GET / HTTP/1.0$'\r\n\r' >&7
$ read -u 6 foo
$ echo $foo
HTTP/1.1 500 Domain Not Found
$ exec 7>&-
$ exec 6>&-
Here is my crontab file (Ubuntu 10.10):
57 1 * * 2-6 ET=`date --date 'yesterday' +%Y%m%d`;echo $ET
Even the syntax coloring indicates that something is wrong, and I get this error:
Subject: Cron <root> ET=`date --date 'yesterday' + (failed)
Content-Type: text/plain; charset=ANSI_X3.4-1968
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
/bin/sh: Syntax error: EOF in backquote substitution
But I am not sure what's wrong. Thanks a lot!
The % sign needs to be escaped in a crontab - http://www.hcidata.info/crontab.htm
Try it with a backslash:
57 1 * * 2-6 ET=`date --date 'yesterday' +\%Y\%m\%d`;echo $ET
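An alternative that avoids the escaping altogether is to move the command into a small script and call that script from the crontab, since `%` has no special meaning inside a script file. A minimal sketch, assuming GNU `date` is available (as it is on Ubuntu); the script's name and location are up to you:

```sh
#!/bin/sh
# Yesterday's date as YYYYMMDD; --date is a GNU date option.
ET=$(date --date 'yesterday' +%Y%m%d)
echo "$ET"
```

The crontab line then only needs the schedule fields followed by the path to this script.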