How to get access_log summary by goaccess starting from certain date? - bash

Currently I keep 6 weeks of apache access_log. If I generate an access summary at month end:
cat /var/log/httpd/access_log* | goaccess --output-format=csv
the summary will include some access data from previous month.
How can I skip logs of previous month and summarise from first day of month?
P.S. The date format is: %d/%b/%Y

You can trade the Useless Use of cat for a useful grep.
grep -h "$(date +'[0-3][0-9]/%b/%Y')" /var/log/httpd/access_log* |
goaccess --output-format=csv
If the logs are by date, it would be a lot more economical to skip the logs which you know are too old or too new, i.e. modify the wildcard argument so you only match the files you really want (or run something like find -mtime -30 to at least narrow the set to a few files).
(The cat is useless because, if goaccess is at all correctly written, it should be able to handle
goaccess --output-format=csv /var/log/httpd/access_log*
just fine.)
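The grep approach above can be wrapped in a small function. This is a minimal sketch, assuming GNU date (for the -d option) and log timestamps in the %d/%b/%Y form given in the question; month_pattern and filter_month are hypothetical helper names, not part of goaccess.

```shell
#!/bin/bash
# month_pattern [DATE]: print a grep pattern matching any day of DATE's month.
# LC_ALL=C keeps %b as the English abbreviation that Apache writes.
# Assumes GNU date, which accepts -d.
month_pattern() {
    LC_ALL=C date -d "${1:-now}" +'[0-3][0-9]/%b/%Y'
}

# filter_month DATE FILE...: print only the log lines from DATE's month.
# -h suppresses the file-name prefix grep adds when given several files.
filter_month() {
    local pat
    pat=$(month_pattern "$1") || return
    shift
    grep -h -- "$pat" "$@"
}

# Hypothetical usage, mirroring the pipeline in the question:
#   filter_month now /var/log/httpd/access_log* | goaccess --output-format=csv
```

Passing the date as an argument makes it possible to summarise an earlier month as well, for example filter_month 2020-05-01 for May 2020.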

Related

FreeBSD script to show active connections and append number remote file

I am using NetScaler FreeBSD, which recognizes many of the UNIX like commands, grep, awk, crontab… etc.
I run the following command to get the number of connected users that we have on the system
#> nsconmsg -g aaa_cur_ica_conn -d stats
OUTPUT (numbered lines):
Line1: Displaying current counter value information
Line2: NetScaler V20 Performance Data
Line3: NetScaler NS11.1: Build 63.9.nc, Date: Oct 11 2019, 06:17:35
Line4:
Line5: reltime:mili second between two records Sun Jun 28 23:12:15 2020
Line6: Index reltime counter-value symbol-name&device-no
Line7: 1 2675410 605 aaa_cur_ica_conn
…
…
From the above output, I only need the number of connected users (Line 7, 3rd column: 605 to be precise), along with the hostname and the time the script ran.
To extract this 3rd column number (605), along with the hostname and the time the data was collected, I wrote the following script:
printf '%s - %s - %s\n' "$(hostname)" "$(date '+%H:%M')" "$(nsconmsg -g aaa_cur_ica_conn -d stats | grep aaa_cur_ica_conn | awk '{print $3}')"
The result is perfect, showing hostname, time, and the number of connected users as follows:
Hostname - 09:00 - 605
Now can anyone please shed light on how I can:
Run this script every day - 5am to 5pm (12hours)?
Each time scripts runs - append a file on a remote Unix share with the output?
I appreciate this might be a bit of a challenge... however I would be grateful to any bash script wizards out there who can create magic!
Thanks in advance!
I would suggest a quick look into the FreeBSD Handbook or For People New to Both FreeBSD and UNIX® so that you could get familiar with the operating system and tools that could help you achieve better what you want.
For example, there is a utility/command named cron
The software utility cron is a time-based job scheduler in Unix-like computer operating systems.
For example, to run something every minute of every day from 5am until the end of the 5pm hour (17:59), you could use something like:
* 05-17 * * * command
Try more options here: https://crontab.guru/#*_05-17_*_*_*.
There are more tools for scheduling commands, for example at (https://en.wikipedia.org/wiki/At_(command)), but this is something you need to evaluate and read more about.
Now, regarding the command you are using to get the "number of connected users", you could avoid the grep and just use awk, for example:
awk '/aaa_cur_ica_conn/ {print $3}'
This will print only column 3 if the line contains aaa_cur_ica_conn, but as before I invite you to read more about the topic so that you can get a better overview and better understand the commands.
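Putting the pieces together, the awk filter and the printf line from the question could be wrapped so that each run appends one record to a file. This is only a sketch: count_from_stats and append_record are hypothetical names, and the script path and /mnt/share/ica_users.log are assumptions that stand in for wherever the remote share is actually mounted.

```shell
#!/bin/sh
# count_from_stats: read nsconmsg output on stdin and print the counter
# value, i.e. column 3 of the line that names the counter.
count_from_stats() {
    awk '/aaa_cur_ica_conn/ {print $3}'
}

# append_record FILE: read nsconmsg output on stdin and append
# "hostname - HH:MM - count" to FILE.
append_record() {
    count=$(count_from_stats)
    printf '%s - %s - %s\n' "$(hostname)" "$(date '+%H:%M')" "$count" >> "$1"
}

# Hypothetical script body, e.g. saved as /root/ica_users.sh:
#   nsconmsg -g aaa_cur_ica_conn -d stats | append_record /mnt/share/ica_users.log
# Hypothetical crontab entry, once an hour on the hour from 05:00 to 17:00:
#   0 5-17 * * * /bin/sh /root/ica_users.sh
```

Using >> appends rather than overwrites, so the file grows by one line per run. Writing to a remote share this way assumes the share is mounted locally; if it is not, the record would have to be copied over by some other means.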
Last but not least, check this link: How do I ask a good question? The better you format and elaborate your question, the easier it is for others to give an answer.

Getting log entry "disk online" from system log

When a disk is inserted into my cluster, I want to know about it.
So I need to watch /var/adm/messages, and whenever a NEW "online" line appears, write it to a different log file.
When disk goes online I get this kind of log entries:
Dec 8 10:10:46 SMNODE01 genunix: [ID 408114 kern.info] /scsi_vhci/disk#g5000c50095f92a8f (sd69) online
tail works without the -F option, but I need the -F option :/
tail messages | grep 408114 | grep '/scsi_vhci/disk#'| egrep -wi --color 'online'
There are 3 fixed strings I can grep for:
1- The id "408114" is unique for online status.
2- /scsi_vhci/disk#
3- online
P.S: Sorry for my english :)
To AND patterns together in grep, join them with .*:
$ grep '408114.*/scsi_vhci/disk#.*online' test
Dec 8 10:10:46 SMNODE01 genunix: [ID 408114 kern.info] /scsi_vhci/disk#g5000c50095f92a8f (sd69) online
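The single pattern also helps with tail -F. A chain of several greps on a live stream often looks as if it produces nothing, because grep buffers its output when it writes to a pipe. The sketch below wraps the pattern in a hypothetical function named online_events. The output path in the comment is an assumption, and --line-buffered is a GNU grep option that may not exist in the grep shipped with the system.

```shell
#!/bin/sh
# online_events: read syslog lines on stdin, print only disk "online" events.
# All three fixed strings must appear, in this order, on the same line.
online_events() {
    grep -- '408114.*/scsi_vhci/disk#.*online'
}

# Hypothetical long-running use (GNU grep assumed for --line-buffered):
#   tail -F /var/adm/messages |
#       grep --line-buffered '408114.*/scsi_vhci/disk#.*online' >> /var/log/disk_online.log
```

With a single grep writing straight to a file, there is only one stage where buffering can delay the output.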
Next time don't edit the question completely but ask another question.

Is there a bash expansion for syslog.1 syslog when syslog.1 may not exist?

I'd like to monitor syslog events every hour. I use dategrep to get the last hour but on log rotation the last hour may span to the previous syslog.
Is there an expansion to achieve listing the two recent syslog files in ascending order?
$(ls -tr syslog* | tail -n 2)
The output should be
syslog.1 syslog # when syslog.1 exists
or
syslog # when it doesn't
I've tried syslog{.1,} but it always outputs syslog.1.
Thank you!
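One possible approach, sketched here under the assumption that bash is the shell in use: brace expansion such as syslog{.1,} generates both names whether or not the files exist, whereas a glob pattern is only replaced by names that exist. Writing the suffix as the bracket pattern [1] and enabling nullglob makes syslog.1 vanish when it is missing. recent_syslogs is a hypothetical function name.

```shell
#!/bin/bash
# recent_syslogs [DIR]: print "syslog.1" (only if it exists) and then "syslog".
# The body runs in a subshell so that cd and nullglob do not leak out.
recent_syslogs() (
    cd "${1:-.}" || exit
    # With nullglob set, a pattern that matches nothing expands to nothing.
    shopt -s nullglob
    # syslog.[1] is a glob that can only match the name syslog.1;
    # the plain word syslog is not a pattern, so it is always printed.
    printf '%s\n' syslog.[1] syslog
)

# Hypothetical usage:
#   recent_syslogs /var/log
```

The plain name syslog is printed even if that file is absent, so the caller still has to cope with a missing current log.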

How to resume reading a file?

I'm trying to find the best and most efficient way to resume reading a file from a given point.
The given file is being written frequently (this is a log file).
This file is rotated on a daily basis.
In the log file I'm looking for the pattern 'slow transaction'. Such lines end with a number in parentheses. I want the sum of those numbers.
Example of log line:
Jun 24 2015 10:00:00 slow transaction (5)
Jun 24 2015 10:00:06 slow transaction (1)
This is the easy part, which I could do with an awk command to get a total of 6 for the example above.
Now my challenge is that I want to get the values from this file on a regular basis. I've an external system that polls a custom OID using SNMP. When hitting this OID the Linux host runs a couple of basic commands.
I want this SNMP polling event to get the number of events since the last polling only. I don't want to have the total every time, just the total of the newly added lines.
Just to mention that only bash can be used, or basic commands such as awk sed tail etc. No perl or advanced programming language.
I hope my description is clear enough. Apologies if this is a duplicate. I did some research before posting but did not find anything that precisely corresponds to my need.
Thank you for any assistance
In addition to the methods in the comment link, you can also simply use dd and stat to read the logfile size, save it and sleep 300 then check the logfile size again. If the filesize has changed, then skip over the old information with dd and read the new information only.
Note: it is worth testing for the case where the logfile is deleted and then restarted with 0 size (e.g. if ((newsize < size)), then read everything).
Here is a short example with 5 minute intervals:
#!/bin/bash
lfn=${1:-/path/to/logfile}
size=$(stat -c "%s" "$lfn")                 ## save original log size
while :; do
    newsize=$(stat -c "%s" "$lfn")          ## get new log size
    ((newsize < size)) && size=0            ## log was truncated or restarted: read it all
    if ((size != newsize)); then            ## if change, use new info
        if ((size > 0)); then
            ## use dd to skip over existing text to new text
            newtext=$(dd if="$lfn" bs="$size" skip=1 2>/dev/null)
        else
            newtext=$(<"$lfn")              ## dd rejects bs=0, so read the whole file
        fi
        ## process newtext however you need
        printf "\nnewtext:\n\n%s\n" "$newtext"
        size=$newsize                       ## update size to newsize
    fi
    sleep 300
done
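For the SNMP case in the question, where each poll is a separate short-lived command rather than a loop, the same idea can be sketched with the offset kept in a file between calls. This is an assumption-laden sketch, not a drop-in solution: sum_new_slow and the state file are hypothetical, stat -c is the GNU form used above, and tail and head stand in for dd so that exactly the bytes between the two offsets are read.

```shell
#!/bin/bash
# sum_new_slow LOG STATE: print the sum of the "(N)" values on
# 'slow transaction' lines added to LOG since the previous call.
# STATE is a hypothetical file that stores the last byte offset seen.
sum_new_slow() {
    local log=$1 state=$2 old=0 new
    [ -r "$state" ] && old=$(cat "$state")
    old=${old:-0}
    new=$(stat -c '%s' "$log") || return
    # A smaller file means it was rotated or truncated: start from the top.
    [ "$new" -lt "$old" ] && old=0
    # Read exactly the bytes between the old and the new offset, then let
    # awk split on parentheses so that field 2 is the number.
    tail -c +"$((old + 1))" "$log" | head -c "$((new - old))" |
        awk -F'[()]' '/slow transaction/ {sum += $2} END {print sum + 0}'
    printf '%s\n' "$new" > "$state"
}

# Hypothetical usage from the SNMP handler:
#   sum_new_slow /path/to/logfile /var/tmp/slowtx.offset
```

A line that is only half written at the moment of the poll would be split across two calls, so the sketch assumes the application writes whole lines at a time.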

How to fix the error in the bash shell script?

I am trying to convert some code from a batch script to a shell script, and I am getting an error.
BATCH FILE CODE
:: Create a file with all latest snapshots
FOR /F "tokens=5" %%a in (' ec2-describe-snapshots ^|find "SNAPSHOT" ^|sort /+64') do set "var=%%a"
set "latestdate=%var:~0,10%"
call ec2-describe-snapshots |find "SNAPSHOT"|sort /+64 |find "%latestdate%">"%EC2_HOME%\Working\SnapshotsLatest_%date-today%.txt"
CODE IN SHELL SCRIPT
#Create a file with all latest snapshots
FOR snapshot_date in $(' ec2-describe-snapshots | grep -i "SNAPSHOT" |sort /+64') do set "var=$snapshot_date"
set "latestdate=$var:~0,10"
ec2-describe-snapshots |grep -i "SNAPSHOT" |sort /+64 | grep "$latestdate">"$EC2_HOME%/SnapshotsLatest_$today_date"
I want to sort the snapshots according to dates and to save the snapshots that are created in latest date in a file.
SAMPLE OUTPUT OF ec2-describe-snapshots:
SNAPSHOT snap-5e20 vol-f660 completed 2013-12-10T08:00:30+0000 100% 109030037527 10 2013-12-10: Daily Backup for i-2111 (VolID:vol-f9a0 InstID:i-2601)
It will contain records like this
I got this code :
latestdate=$(ec2-describe-snapshots | grep ^SNAPSHOT | sort -k 5 | awk '{print $5}')
ec2-describe-snapshots | grep SNAPSHOT.*$latestdate | > "$EC2_HOME/SnapshotsLatest_$today_date"
but getting this error :
grep: 2013-12-10T09:55:34+0000: No such file or directory
grep: 2013-12-11T04:16:49+0000: No such file or directory
grep: 2013-12-11T04:17:57+0000: No such file or directory
I have some snapshots made on Amazon. I want to find the snapshots made on the latest date and store them in a file. For example, for the date 2013-12-10, the snapshots made on that date should be stored in the file. The contents of the SnapshotsLatest file should be:
SNAPSHOT snap-c17f3 vol-f69a0 completed 2013-12-04T09:24:50+0000 100% 109030037527 10 2013-12-04: Daily Backup for Sanjay_Test_Machine (VolID:vol-f66409a0 InstID:i-26048111)
SNAPSHOT snap-c7d617f9 vol-3d335f6b completed 2013-12-04T09:24:54+0000 100% 109030037527 10 2013-12-04: Daily Backup for sacht_VPC (VolID:vol-3db InstID:i-ed6)
Please note that if there are snapshots created on 2013-12-10, 2013-12-11 and 2013-12-12, the latest date should be 2013-12-12 and all the snapshots created on 2013-12-12 should be saved in the file.
Any suggestion or lead is appreciated.
Neither the batch script nor the shell script you posted are a good starting point so let's start from scratch. Sorry, this is too big for a comment.
You want to find the latest snapshots made on a date and then want to store them in a file.
What does that mean?
Do the snapshot files have a timestamp in their name or in their content?
If not - UNIX does not store file creation timestamps so is a last-modified timestamp adequate?
Do you literally want to concatenate all of your snapshot files into one single file or do you want to create a file that has a list of the snapshot file names?
Post some sample input (e.g. some snapshot file names and contents if that's where the timestamp is stored) and the expected output given that input.
Update your question to address all of the above, do not try to reply in a comment.
Minor issue, you don't need a pipe when re-directing output, so your line to save should be
ec2-describe-snapshots | grep "SNAPSHOT.*$latestdate" > "$EC2_HOME/SnapshotsLatest_$today_date"
Now the main issue here, is that the grep is messed up. I haven't worked with amazon snapshots, but judging by your example descriptions, you should be doing something like
latestdate=$(ec2-describe-snapshots | grep -oP "\d+-\d+-\d+" | sort -r | head -1)
This will get all the dates containing the form dddd-dd-dd from the file (I'm assuming the two dates in each snapshot line always match up), sort them in reverse order (latest first) and take the head which is the latest date, storing it in $latestdate.
Then to store all snapshots with the given date do something like
ec2-describe-snapshots | grep -oP "SNAPSHOT(.*?)${latestdate}T(.*?)\)" > "$EC2_HOME/SnapshotsLatest_$today_date"
This will get all text starting with SNAPSHOT, containing the given date, and ending in a closing ")" and save it. Note, you may have to mess around with it a bit, if ")" can be present elsewhere.
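An alternative that avoids guessing where a record ends is to compare the date in field 5 directly. The sketch below assumes, based on the sample output in the question, that every record starts with SNAPSHOT and carries its timestamp in the fifth whitespace-separated field. latest_snapshots is a hypothetical function name.

```shell
#!/bin/sh
# latest_snapshots: read ec2-describe-snapshots output on stdin and print
# the SNAPSHOT lines whose date (first 10 characters of field 5) is the latest.
latest_snapshots() {
    data=$(cat)
    # ISO dates sort correctly as plain text, so the last one is the newest.
    latest=$(printf '%s\n' "$data" |
        awk '$1 == "SNAPSHOT" {print substr($5, 1, 10)}' | sort | tail -n 1)
    printf '%s\n' "$data" |
        awk -v d="$latest" '$1 == "SNAPSHOT" && substr($5, 1, 10) == d'
}

# Hypothetical usage, with the variables from the question:
#   ec2-describe-snapshots | latest_snapshots > "$EC2_HOME/SnapshotsLatest_$today_date"
```

Reading the command output once and reusing it means both steps work on the same data, even if a new snapshot appears while the script is running.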

Resources