Avoid mass e-mail notification in error analysis bash script - bash

I select error log details from a Docker container and decide within a shell script how and when to alert about the issue via Discord and/or email.
Because I am receiving the email alerts too often with the same information in the email body, I want to implement the following two adjustments:
Fatal error log selection:
FATS="$(docker logs --since 24h $NODENAME 2>&1 | grep 'FATAL' | grep -v 'INFO')"
Email sent if FATS has any content:
swaks --from "$MAILFROM" --to "$MAILTO" --server "$MAILSERVER" --auth LOGIN --auth-user "$MAILUSER" --auth-password "$MAILPASS" --h-Subject "FATAL ERRORS FOUND" --body "$FATS" --silent "1"
How can I send the email only if the content of FATS differs from the previous run of the script? I have thought about hashing its content and storing the hash in a text file that is read on the next run. If the hash is the same as in the previous run, the email would be skipped.
Another option could be a temporary variable in the user's global bash profile, so that no file has to be stored on the file system (to avoid reads/writes).
How can I do that?

When you write a script for your monitoring, add functions for additional functionality, such as:
logging all the alerts that have been sent
making sure you don't send more than 1 alert per hour
sending warnings only during working hours
escalating a message when a check fails N times without an intermediate success
possibly sending an alert to different receivers (different email addresses, or also SMS or Teams)
providing an interface for an operator, so they can look back at when something first went wrong
When you control which messages you send, it is easy to filter duplicate messages (after changing --since).
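Two of the points above, filtering duplicates and limiting alerts to one per hour, could be sketched roughly as follows, using the hash-in-a-file idea from the question. This is only a sketch under assumptions: the names should_alert, STATE_DIR and MIN_INTERVAL are made up for illustration, and sha256sum (GNU coreutils) is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch: suppress duplicate and too-frequent alerts.
# STATE_DIR and MIN_INTERVAL are hypothetical names, not part of the original script.
STATE_DIR="${STATE_DIR:-${TMPDIR:-/tmp}/fatal-alert-state}"
MIN_INTERVAL="${MIN_INTERVAL:-3600}"   # minimum number of seconds between two alerts

# should_alert CONTENT
# Succeeds (exit status 0) only if CONTENT is non-empty, its hash differs from
# the hash recorded for the last alert, and that alert is at least
# MIN_INTERVAL seconds old. On success, the new hash and time are recorded.
should_alert() {
    local content="$1" hash now last_hash="" last_time=0
    [ -n "$content" ] || return 1
    mkdir -p "$STATE_DIR" || return 1
    hash=$(printf '%s' "$content" | sha256sum | cut -d' ' -f1)
    now=$(date +%s)
    if [ -f "$STATE_DIR/hash" ]; then last_hash=$(cat "$STATE_DIR/hash"); fi
    if [ -f "$STATE_DIR/time" ]; then last_time=$(cat "$STATE_DIR/time"); fi
    # Same content as the last alert: skip.
    [ "$hash" != "$last_hash" ] || return 1
    # Last alert was sent too recently: skip.
    [ $(( now - last_time )) -ge "$MIN_INTERVAL" ] || return 1
    printf '%s\n' "$hash" > "$STATE_DIR/hash"
    printf '%s\n' "$now" > "$STATE_DIR/time"
}
```

In the monitoring script, the existing swaks call would then be wrapped in a check such as `if should_alert "$FATS"; then` followed by the swaks command and `fi`. Two small files are written per alert; a variable in the bash profile would not survive between separate runs of the script, which is why a file is used here.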

I've gone with the proposal of #ralf-dreager and reduced the selection to 1d and 1h. Consequently, I've changed my monitoring script to go through the results of either 1d or just 1h, without selecting the same entries again and again. This is a huge performance improvement, and there is no need to store anything in a variable or on the file system.
FATS="$(docker logs --since 1h $NODENAME 2>&1 | grep 'FATAL' | grep -v 'INFO')"

Related

Filter Column in CSV and get the unique value

I have three columns in a CSV: Client Name, Save Set Name and Status. Some clients have both a Failed and a Success status. I want to keep only those clients whose status is exclusively Failed, and omit clients that have both Failed and Success entries.
The command below also returns entries that failed at some point but may have succeeded later on. I want only the entries that are Failed and were not successful even once.
cat "$pwd"/Daily-Failed.csv|egrep -i 'failed|Interrupted'|awk -F',' '{print $2,$3,$9}'|sort -u > "$pwd"/Final-Failed/Failed.csv
(edit) Or with newlines:
cat "$pwd"/Daily-Failed.csv|
egrep -i 'failed|Interrupted'|
awk -F',' '{print $2,$3,$9}'|
sort -u > "$pwd"/Final-Failed/Failed.csv
Please find the input and desired output.
Input:
Client Name,Save Set,Status
Star,D:/,Failed
Star,C:/,Failed
Moon,C:/,Failed
Galaxy,D:/,Failed
Sun,D:/,Failed
Star,C:/,Success
Sun,D:/,Success
Output:
"Client Name","Save Set",Status
Galaxy,D:/,Failed
Moon,C:/,Failed
Star,D:/,Failed
I want to filter those clients only which have status as only Failed. Clients who are having two entries such as Failed and success also, I want to omit.
I'm going to assume, looking at your sample input (Which really needs to be text in your question, not an image), that both the Client Name and Save Set columns matter - you have (Star, C:/) with both success and failure rows, and (Star, D:/) with just failure, and the latter shows up in your output, and that's the only way that would make sense given your stated goal. On the other hand you also have two (Sun, D:/) rows, one success, one failure, and that shows up in your output even though it doesn't meet your criteria any way you look at it...
Anyways, this sort of grouping and filtering of tabular data screams database, and I like to script sqlite to make it do all the work in such cases:
#!/bin/sh
filename=Daily-Failed.csv
sqlite3 -batch -csv -header <<EOF
.import '${filename}' tbl
SELECT *
FROM tbl
GROUP BY "Client Name", "Save Set"
HAVING count(*) = 1 AND Status = 'Failed'
EOF
after taking the data in your image and turning it into a CSV file Daily-Failed.csv looking like
Client Name,Save Set,Status
Star,D:/,Failed
Star,C:/,Failed
Moon,C:/,Failed
Galaxy,D:/,Failed
Sun,D:/,Failed
Star,C:/,Success
Sun,D:/,Success
that script will output
"Client Name","Save Set",Status
Galaxy,D:/,Failed
Moon,C:/,Failed
Star,D:/,Failed
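If sqlite3 is not available, a similar filter could be sketched in awk by reading the file twice. This is an illustration, not the method of the answer above: the function name failed_only is made up, the file is assumed to have a header line and three comma-separated columns without quoted commas, and unlike the SQL it does not require count(*) = 1, so a pair with several Failed rows and no other status is printed once per row, in input order.

```shell
# failed_only FILE
# Prints the header plus every row whose (Client Name, Save Set) pair never
# appears with a status other than Failed.
failed_only() {
    awk -F',' '
        # First pass: remember every pair that has a non-Failed row.
        NR == FNR {
            if (FNR > 1 && $3 != "Failed") seen_other[$1 FS $2] = 1
            next
        }
        # Second pass: print the header and all rows of the remaining pairs.
        FNR == 1 || !(($1 FS $2) in seen_other)
    ' "$1" "$1"
}
```

It would be called as `failed_only Daily-Failed.csv`, optionally piped through sort if sorted output is wanted.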

Candump Filter is occasionally not working correctly

In a bash script that reads information from a microboard via CAN, I use the candump command with a filter to read a specific message.
My problem is that, while the filter itself works correctly, the candump command with the filter occasionally does not record the specific message on the CAN bus.
I have already verified that the expected message is being sent by displaying all CAN messages with candump without the filter.
The code of the bash script to receive the specific can message is displayed here :
CAN_PORT="can4"
CAN_ID_GET_VERSION=01500000
CAN_ID_SET_VERSION=01230000
candump -L ${CAN_PORT},${CAN_ID_SET_VERSION}:1ffffff | tee temp_candump.log &
candumpid=$!
cansend ${CAN_PORT} ${CAN_ID_GET_VERSION}#
sleep 0.5 # wait for an answer from microboard
kill $candumpid
cat temp_candump.log
This code gives me the expected can message about 9 out of 10 times.
My question is whether there is a problem in the code, or whether someone else has experienced a similar problem and found a solution.
Any answer would be appreciated.
With kind regards

How to verify AB responses?

Is there a way to make sure that AB gets proper responses from the server? For example:
To force it to output the response of a single request to STDOUT, or
To ask it to check that some text fragment is included in the response body
I want to make sure that authentication worked properly and that I am measuring the response time of the target page, not the login form.
Currently I just replace ab -n 100 -c 1 -C "$MY_COOKIE" $MY_REQUEST with curl -b "$MY_COOKIE" $MY_REQUEST | lynx -stdin .
If it's not possible, is there an alternative more comprehensive tool that can do that?
You can use the -v option as listed in the man doc:
-v verbosity
Set verbosity level - 4 and above prints information on headers, 3 and above prints response codes (404, 200, etc.), 2 and above prints warnings and info.
https://httpd.apache.org/docs/2.4/programs/ab.html
So it would be:
ab -n 100 -c 1 -C "$MY_COOKIE" -v 4 $MY_REQUEST
This will print the response headers and HTML content. A value of 3 is enough to check for a redirect, since it prints the response codes.
I didn't try piping it to Lynx but grep worked fine.
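The grep check mentioned above could be wrapped in a small helper so that the benchmark only runs when a single request returns the expected page. This is a sketch under assumptions: body_contains is a made-up name, and 'Welcome back' stands in for some text that only the authenticated target page contains.

```shell
# body_contains FRAGMENT
# Reads a response body on stdin and succeeds if FRAGMENT occurs in it.
# -F treats FRAGMENT as a fixed string, -q suppresses output.
body_contains() {
    grep -qF -- "$1"
}

# Hypothetical usage, reusing the variables from the question:
#   curl -s -b "$MY_COOKIE" "$MY_REQUEST" | body_contains 'Welcome back' \
#       && ab -n 100 -c 1 -C "$MY_COOKIE" "$MY_REQUEST"
# The same helper can read the output of a single verbose ab request:
#   ab -n 1 -c 1 -C "$MY_COOKIE" -v 4 "$MY_REQUEST" | body_contains 'Welcome back'
```

This only verifies one request before the run; it does not check every response that ab receives during the benchmark.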
Apache Benchmark is good for a cursory glance at your system but is not very sophisticated. I am currently attempting to tune a web service and am finding that AB does not measure complete response time when considering the transfer of the body. Also as you mention you can not verify what is returned.
My current recommendation is Apache JMeter. http://jmeter.apache.org/
I am having much better success with it. You may find the Response Assertion useful for your situation. http://jmeter.apache.org/usermanual/component_reference.html#Response_Assertion

mail sent by bash shell, but not received

Mail was sent from bash with return code 0, but was not received (I checked the target mailbox for a few hours).
The shell command is:
echo "BodyOfMail" | mutt -a 1.png -s "AttachSuccess" -- lichunyu#xiaomi.com
However, when I change the target address to xxx#qq.com, the mail is received, i.e.:
echo "BodyOfMail" | mutt -a 1.png -s "AttachSuccess" -- coxfilur_2005#qq.com
The mail can be received.
At one point I suspected that the png attachment might be the cause, so I removed it, but I still get the same result (xxx#xiaomi.com fails, xxx#qq.com is OK).
In both cases, I checked the return code with $?, and both were 0 (which should mean success).
Where is the problem, and how can I solve it?
If there is something wrong with the domain xiaomi.com, how can I find that out?

Can you view historic logs for parse.com cloud code?

On the Parse.com cloud-code console, I can see logs, but they only go back maybe 100-200 lines. Is there a way to see or download older logs?
I've searched their website & googled, and don't see anything.
Using the parse command-line tool, you can retrieve an arbitrary number of log lines:
Usage:
parse logs [flags]
Aliases:
logs, log
Flags:
-f, --follow=false: Emulates tail -f and streams new messages from the server
-l, --level="INFO": The log level to restrict to. Can be 'INFO' or 'ERROR'.
-n, --num=10: The number of the messages to display
Not sure if there is a limit, but I've been able to fetch 5000 lines of log with this command:
parse logs prod -n 5000
To add on to Pascal Bourque's answer, you may also wish to filter the logs by a given range of dates. To achieve this, I used the following:
parse logs -n 5000 | sed -n '/2016-01-10/, /2016-01-15/p' > filteredLog.txt
This fetches up to 5000 log lines, uses sed to print everything from the first line containing 2016-01-10 through the next line containing 2016-01-15, and stores the result in filteredLog.txt.
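The sed range depends on the order of the lines and on both dates actually occurring in the output. An alternative that compares the date on each line could be sketched as follows. It assumes that every relevant log line contains a date in YYYY-MM-DD form, which is an assumption about the log format rather than something confirmed here; filter_by_date is a made-up name.

```shell
# filter_by_date FROM TO
# Reads log lines on stdin and prints those whose first YYYY-MM-DD date lies
# between FROM and TO (inclusive). Lines without such a date are dropped.
filter_by_date() {
    awk -v from="$1" -v to="$2" '
        match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/) {
            day = substr($0, RSTART, RLENGTH)
            # ISO dates sort correctly when compared as plain strings.
            if (day >= from && day <= to) print
        }
    '
}
```

It would be used as `parse logs -n 5000 | filter_by_date 2016-01-10 2016-01-15 > filteredLog.txt`.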
