Let's say I have a file containing names that I want to substitute, one after another, into another file:
file_names.txt
pfg022G
pfg022T
pfg068T
pfg130T
pfg181G
pfg181T
pfg424G
pfg424T
I would like to use sed to apply the entries of file_names.txt to example.conf:
example.conf
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022G.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "pfg022G",
"base_file_name": "pfg022G.GRCh38DH.target",
"final_gvcf_base_name": "pfg022G.GRCh38DH.target"
},
The sed command would replace pfg022G in example.conf with pfg022T, the next item in file_names.txt (sed s/pfg022G/pfg022T/). At that point example.conf should look like this:
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022T.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "pfg022T",
"base_file_name": "pfg022T.GRCh38DH.target",
"final_gvcf_base_name": "pfg022T.GRCh38DH.target"
},
After 15 minutes the substitution should be pfg022T to pfg068T and so on until all the items in file_names.txt are exhausted.
The following crontab would run your script every 15 minutes:
# Example of job definition:
# .---------------- minute (0 - 59)
# | .------------- hour (0 - 23)
# | | .---------- day of month (1 - 31)
# | | | .------- month (1 - 12) OR jan,feb,mar,apr ...
# | | | | .---- day of week (0 - 6) (Sunday=0 or 7)
# | | | | |
# * * * * * command to be executed
*/15 * * * * /path/to/script
with the script reading:
#!/usr/bin/env sh
file1="file_names.txt"
file2="example.conf"
sed -i -e "$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$file1" | tac)" "$file2"
The trick we use here is to do the substitutions in reverse order. The file example.conf always contains exactly one of the strings listed in file_names.txt, so if you attempt the substitutions from the last pair to the first, only a single one takes effect.
We use awk to build a sed script; tac then reverses it so that only a single rule matches. The awk step alone produces:
$ awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' file_names.txt
s/pfg022G/pfg022T/g
s/pfg022T/pfg068T/g
s/pfg068T/pfg130T/g
s/pfg130T/pfg181G/g
s/pfg181G/pfg181T/g
s/pfg181T/pfg424G/g
s/pfg424G/pfg424T/g
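If we run sed with the script above in this order, we always end up with pfg424T (the last entry). Suppose example.conf currently holds the third entry, pfg068T: the rule s/pfg068T/pfg130T/g matches, its result then matches the next rule, and so on, so sed performs every substitution after the first match. When we reverse the order (using tac), the rule that could match the new name has already been tried by the time a substitution happens, so sed makes only a single substitution.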
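The difference between the two orders can be checked on throwaway files. This is only a minimal sketch, assuming GNU sed and tac are available; the scratch directory and the shortened three-name list are made up for the demonstration:

```shell
#!/bin/sh
# Demonstration only: work in a scratch directory, not on the real files.
tmp=$(mktemp -d)
printf '%s\n' pfg022G pfg022T pfg068T > "$tmp/file_names.txt"
printf '"sample_name": "pfg022G",\n' > "$tmp/example.conf"

# Build the sed script: one rule per consecutive pair of names.
rules=$(awk '(NR>1){print "s/"p"/"$1"/g"}{p=$1}' "$tmp/file_names.txt")

# Forward order: each rule's output matches the next rule, so the
# substitutions cascade all the way to the last name.
forward=$(sed -e "$rules" "$tmp/example.conf")

# Reversed order: the rule that could match the new name has already
# been tried, so exactly one step is taken.
reversed=$(sed -e "$(printf '%s\n' "$rules" | tac)" "$tmp/example.conf")

echo "forward:  $forward"
echo "reversed: $reversed"
rm -r "$tmp"
```

The forward run ends on pfg068T (the last name in the shortened list), while the reversed run stops at pfg022T.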
Here is the logic of how I think this would work:
Create a cron job (or an anacron job, if your server shuts down periodically) to run a bash script every 15 minutes.
In the bash script, use an if statement with grep to test which line of file_names.txt currently exists in example.conf, and take the line after it as the replacement. If the matching line is the last one in file_names.txt, the script should stop with the exit command.
Run the sed command to replace the current string with the next one; sed's s command should be able to handle this.
If the service has to be reloaded to pick up the amended configuration, add that step afterwards.
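The steps above could be sketched roughly as follows. The function name advance is hypothetical, GNU sed is assumed for -i, and the names are assumed to contain no characters that are special to sed (true for the sample list):

```shell
#!/bin/sh
# advance NAMES CONF
# Find the name from NAMES that is currently present in CONF and replace it
# with the entry on the following line. Returns 1 when CONF already holds
# the last entry (or none of the names), so the caller can stop.
advance() {
    names=$1
    conf=$2
    prev=""
    while IFS= read -r name; do
        # "prev" is the entry on the line before "name".
        if [ -n "$prev" ] && grep -qF -- "$prev" "$conf"; then
            sed -i "s/$prev/$name/g" "$conf"
            return 0
        fi
        prev=$name
    done < "$names"
    return 1
}
```

A cron script could then call something like advance file_names.txt example.conf || exit 0 and reload the service after a successful call.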
It might be easier to create a daemon/background process as opposed to a periodic cron job.
while read -r str
do
sleep 900
# Replace the name in the BAM path, in sample_name, and in both *base*name fields
sed -ri "s#(^\"flowcell_unmapped_bams.*gatk-workflows/src/ubam/)(.*)(\.unmapped\.bam\"\],.*$)#\1$str\3#;s/(^\"sample.name.*: \")(.*)(\",.*$)/\1$str\3/;s/(^\"[a-z_]*base[a-z_]*name.*: \")(.*)(\.GRCh38DH.*$)/\1$str\3/" example.conf
done < file_names.txt &
The while loop reads file_names.txt line by line into the variable str. Each iteration sleeps 900 seconds and then uses str in three sed substitutions. The -r option (or -E) enables extended regular expressions, and each pattern splits the line into three sections; the replacement is section 1, followed by the value of str, followed by section 3. The & at the end runs the loop in the background.
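To see the three-section split in isolation, here is a small sketch that applies the sample_name substitution to a single line read from a pipe instead of the file. The value of str is just an example:

```shell
#!/bin/sh
str=pfg068T                          # example replacement name
line='"sample_name": "pfg022T",'     # one line of example.conf

# Section 1: everything up to the opening quote of the value.
# Section 2: the old name (discarded).
# Section 3: the closing quote and the rest of the line.
result=$(printf '%s\n' "$line" |
    sed -E "s/(^\"sample.name.*: \")(.*)(\",.*$)/\1$str\3/")

echo "$result"
```

Only sections 1 and 3 are referenced in the replacement, so whatever name was in section 2 is dropped regardless of its value.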
I would perhaps just generate all the files in advance in a queue directory, and have the cron job pick up the next one on each invocation.
awk 'NR==FNR { a[++n] = $0; next }
{ file = $1 ".conf"
for(i=1; i<=n; i++) {
l = a[i]; sub("{{name}}", $0, l);
print l >file }
close(file)
}' - file_names.txt <<\____
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/{{name}}.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "{{name}}",
"base_file_name": "{{name}}.GRCh38DH.target",
"final_gvcf_base_name": "{{name}}.GRCh38DH.target"
},
____
Running this on your sample file_names.txt creates the following files:
pfg022G.conf pfg068T.conf pfg181G.conf pfg424G.conf
pfg022T.conf pfg130T.conf pfg181T.conf pfg424T.conf
with contents like you would expect; here's pfg022G.conf:
{
"ExomeGermlineSingleSample.sample_and_unmapped_bams": {
"flowcell_unmapped_bams": ["/groups/cgsd/alexandre/gatk-workflows/src/ubam/pfg022G.unmapped.bam"],
"unmapped_bam_suffix": ".unmapped.bam",
"sample_name": "pfg022G",
"base_file_name": "pfg022G.GRCh38DH.target",
"final_gvcf_base_name": "pfg022G.GRCh38DH.target"
},
Now, your cron job just needs to move one of these to example.conf and process it (the script below expects to find them in a directory called confdir). When the directory with the files is empty, you are done.
#!/bin/sh
for f in confdir/*.conf; do
if [ -e "$f" ]; then
# Safeguard against clobbering previous run
if [ -e ./example.conf ]; then
echo "$0: example.conf is still there -- skipping this run" >&2
exit 63
fi
mv "$f" ./example.conf
exec your_main_script_or_whatever
# Should never fall through to here, but whatever
break
else
echo "$0: directory empty -- aborting" >&2
fi
done
The safeguard avoids a race condition: if the previous cron job is still running, or failed for some reason, we don't want to clobber its input file. This requires your_main_script_or_whatever to remove example.conf when it completes. If you don't care about this, you can simply remove the safeguard condition from the above script.
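One way the main script could honour that contract is sketched below. The helper name run_and_release is hypothetical, and it assumes the real work is a command that takes the configuration file as its last argument:

```shell
#!/bin/sh
# run_and_release CONF COMMAND [ARGS...]
# Run COMMAND with CONF appended as its last argument. CONF is removed
# only when the command succeeds, so a failed run leaves it in place
# and the safeguard in the cron script skips the next invocation.
run_and_release() {
    conf=$1
    shift
    "$@" "$conf" && rm -f -- "$conf"
}
```

For example, run_and_release ./example.conf your_main_script_or_whatever would leave example.conf behind for inspection if the main script exits with a non-zero status.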
Sorry for being unclear, my fellow mates.
So to elaborate and possibly answer my own question, while Distro1Analysis.txt is being written to, calculate output speed in kb/s and when output is done then average output speed and print to screen.
The second part, its own question really, is quite simple. I'm not a computer scientist or advanced programmer, but I am certain there's a relatively easy way to improve the overall execution speed of the script. That means asking what the speed culprit is: how the script was written, the chosen programs, or the mix of programs (i.e., is it faster to use 3 instances of the same program as opposed to one instance of 3 different programs?). For instance, could recursion be used, and how?
I was originally going to ask how to benchmark the time a program takes to run one command, but it seemed simpler to use an overarching (global) benchmark, hence the question. Any help you can provide would be useful.
Rdepends Version
ps -A &>> Distro1Analysis.txt && sudo service --status-all &>> Distro1Analysis.txt && \
for z in $(dpkg -l | awk '/^[hi]i/{print $2}' | grep -v '^lib'); do \
printf "\n$z:" && \
aptitude show $z | grep -E 'Uncompressed Size' && \
result=$(apt-rdepends 2>/dev/null $z | grep -v "Depends")
final=$(apt show 2>/dev/null $result | grep -E "Package|Installed-Size" | sed "/APT/d;s/Installed-Size: //");
if [[ (${#final} -le 700) ]]; then echo $final; else :; fi done &>> Distro1Analysis.txt
Depends Version
ps -A &>> Distro1Analysis.txt && sudo service --status-all &>> Distro1Analysis.txt && \
for z in $(dpkg -l | awk '/^[hi]i/{print $2}' | grep -v '^lib'); do \
printf "\n$z:" && \
aptitude show $z | grep -E 'Uncompressed Size' && \
printf "\n" && \
apt show 2>/dev/null $(aptitude search '!~i?reverse-depends("^'$z'$")' -F "%p" | \
sed 's/:i386$//') | grep -E 'Package|Installed-Size' | sed '/APT/d;s/^.*Package:/\t&/;N;s/\n/ /'; done &>> Distro1Analysis.txt
calculate output speed in kb/s and when output is done then average
output speed and print to screen
Here's an answer that basically does two things:
Starts your script in the background.
Checks the size of its output file at every tick (two seconds in the example below) with du -b.
Run the following bash script like so: $ bash scriptoutmon.sh subscript.sh Distro1Analysis.txt 12 10 2
scriptoutmon.sh usage:
$1 : Path to the subscript to run
$2 : Path to output file to monitor
$3 : How long to run scriptoutmon.sh script in seconds.
$4 : How long to run the subscript ($1) in seconds.
$5 : Tick length for displayed updates in seconds.
scriptoutmon.sh:
#!/bin/bash
# Date: 2020-04-13T23:03Z
# Author: Steven Baltakatei Sandoval
# License: GPLv3+ https://www.gnu.org/licenses/gpl-3.0.en.html
# Description: Runs subscript and measures change in file size of a specified file.
# Usage: scriptoutmon.sh [ path to subscript ] [ path to subscript output file ] [ script TTL (s) ] [ subscript TTL (s) ] [ tick size (s) ]
# References:
# [1]: Adrian Pronk (2013-02-22). "Floating point results in Bash integer division". https://stackoverflow.com/a/15015920
# [2]: chronitis (2012-11-15). "bc: set number of digits after decimal point". https://askubuntu.com/a/217575
# [3]: ypnos (2020-02-12). "Differences of size in du -hs and du -b". https://stackoverflow.com/a/60196741
# == Function Definitions ==
echoerr() { echo "$@" 1>&2; } # display message via stderr
getSize() { echo $(du -b "$1" | awk '{print $1}'); } # output file size in bytes. See [3].
# == Initialize settings ==
SUBSCRIPT_PATH="$1" # path to subscript to run
SUBSCRIPT_OUTPUT_PATH="$2" # path to output file generated by subscript
SCRIPT_TTL="$3" # set script time-to-live in seconds
SUBSCRIPT_TTL="$4" # set subscript time-to-live in seconds
TICK_SIZE="$5" # update tick size (in seconds)
# == Perform work ==
timeout "$SUBSCRIPT_TTL" bash "$SUBSCRIPT_PATH" & # run subscript for SUBSCRIPT_TTL seconds.
# note: SUBSCRIPT_OUTPUT_PATH should be path of output file generated by subscript.sh .
if [ -f "$SUBSCRIPT_OUTPUT_PATH" ]; then SUBSCRIPT_OUTPUT_INITIAL_SIZE=$(getSize "$SUBSCRIPT_OUTPUT_PATH"); else SUBSCRIPT_OUTPUT_INITIAL_SIZE="0"; fi # save initial size if file exists.
echoerr "Running $(basename "$SUBSCRIPT_PATH") and then monitoring rate of file size changes to $(basename "$SUBSCRIPT_OUTPUT_PATH")." # explain displayed output
# Calc and display subscript output file size changes
while [ $SECONDS -lt $SCRIPT_TTL ]; do # loop while script age (in seconds) less than SCRIPT_TTL.
if [ $SECONDS -ge $TICK_SIZE ]; then # if after first tick
OUTPUT_PREVIOUS_SIZE="$OUTPUT_CURRENT_SIZE" ; # save size previous tick
OUTPUT_CURRENT_SIZE=$(getSize "$SUBSCRIPT_OUTPUT_PATH") ; # save size current tick
BYTES_WRITTEN=$(( $OUTPUT_CURRENT_SIZE - $OUTPUT_PREVIOUS_SIZE )) ; # calc size difference between current and previous ticks.
WRITE_SPEED_BYTES_PER_SECOND=$(($BYTES_WRITTEN / $TICK_SIZE)) ; # calc write speed in bytes per second
WRITE_SPEED_KILOBYTES_PER_SECOND=$( echo "scale=3; $WRITE_SPEED_BYTES_PER_SECOND / 1000" | bc -l ) ; # calc write speed in kilobytes per second. See [1], [2].
echo "File size change rate (KB/sec):"$WRITE_SPEED_KILOBYTES_PER_SECOND ;
else # if first tick
OUTPUT_CURRENT_SIZE=$(getSize "$SUBSCRIPT_OUTPUT_PATH") # save size current tick (initial)
fi
sleep "$TICK_SIZE"; # wait a tick
done
SUBSCRIPT_OUTPUT_FINAL_SIZE=$(getSize "$SUBSCRIPT_OUTPUT_PATH") # save final size
# == Display results ==
SUBSCRIPT_OUTPUT_TOTAL_CHANGE_BYTES=$(( $SUBSCRIPT_OUTPUT_FINAL_SIZE - $SUBSCRIPT_OUTPUT_INITIAL_SIZE )) # calc total size change in bytes
SUBSCRIPT_OUTPUT_TOTAL_CHANGE_KILOBYTES=$( echo "scale=3; $SUBSCRIPT_OUTPUT_TOTAL_CHANGE_BYTES / 1000" | bc -l ) # calc total size change in kilobytes. See [1], [2].
echoerr "$SUBSCRIPT_OUTPUT_TOTAL_CHANGE_KILOBYTES kilobytes added to $SUBSCRIPT_OUTPUT_PATH size in $SUBSCRIPT_TTL seconds."
exit 0;
You should get output like this:
baltakatei@debianwork:/tmp$ bash scriptoutmon.sh subscript.sh Distro1Analysis.txt 12 10 2
Running subscript.sh and then monitoring rate of file size changes to Distro1Analysis.txt.
File size change rate (KB/sec):6.302
File size change rate (KB/sec):.351
File size change rate (KB/sec):.376
File size change rate (KB/sec):.345
File size change rate (KB/sec):.335
15.419 kilobytes added to Distro1Analysis.txt size in 10 seconds.
baltakatei@debianwork:/tmp$
Increase $3 and $4 to monitor the script longer (perhaps to let it finish its work).
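The script reports the total size change; the average rate the question asks for is that total divided by the elapsed time. A minimal sketch, with a hypothetical helper name and the same 1 KB = 1000 bytes convention as the script:

```shell
#!/bin/sh
# average_kb_per_s BYTES SECONDS
# Print the average write rate in KB/s with three decimals.
# Returns 1 if SECONDS is not positive.
average_kb_per_s() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) exit 1
        printf "%.3f\n", b / s / 1000
    }'
}
```

With the figures from the sample run, average_kb_per_s 15419 10 prints 1.542. Inside scriptoutmon.sh the arguments would be the total change in bytes and the subscript run time.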
The second part, its own question really
I'd suggest making it a separate question.
I am fairly new to bash scripting. I have an issue with a cron job: I get too many emails whenever the "ntpq: read: Connection refused" error comes up. I want to add a condition so that no email is sent when this error shows up.
However, I can't seem to parse the output from "ntpq -nc peers". I did try redirecting the output of the cron job to a test.txt file and creating another cron job that parses that file, but I feel like there is a better solution.
Thanks for your help!
Here is my code for the cronjob
#!/bin/bash
limit=10101010101010101010000 # Set your limit in milliseconds here
offsets=$(/usr/sbin/ntpq -nc peers | /usr/bin/tail -n +3 | awk 'BEGIN { FS = " " } ; { print $9 }' | /usr/bin/tr -d '-')
for offset in ${offsets}; do
if echo $offset $limit | awk '{exit $1>$2?0:1}'
then
echo "NTPD offset $offset > $limit. Please investigate"
exit 1
fi
done
I have a script which updates DNS records on a DNS server. Every time the named.conf file is updated with a new site I have to raise the serial counter by at least 1.
My script runs on a remote machine, and I'm about to add the following line:
serial=`ssh root@172.19.214.X 'cat /var/named/named.booking.zone |grep serial |awk -F\" \" '{print $1}''`
It doesn't work well; I think I'm not escaping the "" correctly...
Then I thought of something like this:
ssh root@172.19.214.X "sed -e 'g/"$serial"/"$serial"+1/s' /var/named/named.booking.zone"
My source file:
$TTL 600
@ IN SOA root. booking.local. (
2013030311 ; serial (d. adams)
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Minimum
;
IN MX 10 mail
IN NS dns
IN A 172.19.214.X
www IN A 172.19.214.X
Can you please show me how to do the escapes correctly?
Thanks!
Given the zone file from the question, where the serial line reads
2013030311 ; serial (d. adams)
you can use something like this:
#!/usr/bin/bash
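# Read the current serial (first field of the line containing "serial")
serial=$(ssh root@172.19.214.X 'grep serial /var/named/named.booking.zone' 2>/dev/null | awk '{print $1}')
(( next_serial = serial + 1 ))
# Double quotes so that both variables are expanded locally before the command is sent
ssh root@172.19.214.X "sed -i.bak -e 's_${serial}_${next_serial}_g' /var/named/named.booking.zone" 2>/dev/null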
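To try the read-and-increment logic without ssh, the same steps can be run against a local copy of the zone file. This is only a sketch: the function name bump_serial is hypothetical, and it assumes the serial sits on the one line tagged "; serial", as in the question's file:

```shell
#!/bin/sh
# bump_serial ZONEFILE
# Increment the serial on the line tagged "; serial", keeping a .bak copy.
bump_serial() {
    zone=$1
    # First field of the first line that carries the "; serial" comment.
    serial=$(awk '/;[[:space:]]*serial/ {print $1; exit}' "$zone")
    [ -n "$serial" ] || return 1
    next_serial=$((serial + 1))
    # Restrict the substitution to the serial line so that other
    # numbers in the zone file are never touched.
    sed -i.bak "/;[[:space:]]*serial/ s/$serial/$next_serial/" "$zone"
}
```

Limiting the sed address to the serial line avoids rewriting another record that happens to contain the same digits, which the unrestricted s_..._..._g form would do.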
I'm having an issue porting my bash script to Nagios. The script works fine when I run it on the console, but when I run it from Nagios I get the message "(null)". In the Nagios debug log I can see that it parses the script fine, but it returns the error message.
I'm not very good at scripting, so I guess I'll need some help.
The objective of the script is to check the versions of *.ear files on some servers: md5 them and compare the output to see whether the versions match.
To do that, I have a JSON document on these servers that prints the name of each *.ear and its md5.
The first part of the script gets that info from the JSON with curl and stores just the md5 in a temp file per server. It then compares both temp files, and if they match I get the $STATE_OK message. If they don't, it creates a .datetmp file with the date (the objective of this is to print a message after 48 hours of inconsistency). Then I compute the difference between the date in the .datetmp file and the current date: if the result is less than 48 hours it prints $STATE_WAR, and if it is more than 48 hours it prints $STATE_CRI.
The syntax of the script is " $ sh script.sh nameoftheear.ear server1 server2 "
Thanks in advance
#/bin/bash
#Variables For Nagios
cont=$1
bas1=$2
bas2=$3
## Here you set the servers hostname
svr1= curl -s "http://$bas1.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr1
svr2= curl -s "http://$bas2.domain.com:7877/apps.json" | grep -Po '"EAR File":.*? [^\\]",' | grep $cont | awk '{ print $5 }' > .$cont-tmpsvr2
file1=.$cont-tmpsvr1
file2=.$cont-tmpsvr2
md51=$(head -n 1 .$cont-tmpsvr1)
md52=$(head -n 1 .$cont-tmpsvr2)
datenow=$(date +%s)
#Error Msg
ERR_WAR="Not updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
ERR_CRI="48 hs un-updated $bas1: $cont $md51 --- $bas2: $cont $md52 "
OK_MSG="Is up to date $bas1: $cont $md51 --- $bas2: $cont $md52 "
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
##Matching md5 Files
if cmp -s "$file1" "$file2"
then
echo $STATE_OK
echo $OK_MSG
# I do the rm to delete the date tmp file so i can get the $STATE_OK or $STATE_WARNING
rm .$cont-datetmp
exit 0
elif
echo $datenow >> .$cont-datetmp
#Vars to set modification date
datetmp=$(head -n 1 .$cont-datetmp)
diffdate=$(( ($datenow - $datetmp) /60 ))
#This var is to set the time of the critical ERR
days=$((48*60))
[ $diffdate -lt $days ]
then
echo $STATE_WARNING
echo $ERR_WAR
exit 1
else
echo $STATE_CRITICAL
echo $ERR_CRI
exit 2
fi
I am guessing some kind of permission problem; more specifically, I don't think the nagios user can write to its own home directory. You can either fix those permissions or write to a file in /tmp (and consider using mktemp?).
...but ideally you'd skip writing all those files, as far as I can see all of those comparisons etc could be kept in memory.
UPDATE
Looked at your script again - I see some obvious errors you can look into:
You are printing out the exit value before you print the message.
You print the exit value rather than exit with the exit value.
...so this:
echo $STATE_WARNING
echo $ERR_WAR
exit 1
Should rather be:
echo $ERR_WAR
exit $STATE_WARNING
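The same pattern, message first and then the state as the exit status, can be wrapped in a small helper so every branch follows it. This is a sketch; the helper name nagios_exit is made up, and the state values are the ones defined in the question's script:

```shell
#!/bin/sh
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# nagios_exit STATE MESSAGE
# Print the status line on stdout, then exit with the state as the
# exit status (the order described in the answer above).
nagios_exit() {
    printf '%s\n' "$2"
    exit "$1"
}
```

A branch of the script would then end with a single call such as nagios_exit "$STATE_WARNING" "$ERR_WAR".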
Also, I am wondering if this is really the script or if you missed something when pasting. There seems to be an 'if' missing, and also a superfluous line break, in your last piece of code. It should rather be:
if [ $diffdate -lt $days ]
then
...
else
...
fi