Define a specific log format in .goaccessrc - goaccess

I'm reading the goaccess man page but I'm missing simple examples. I have a customised nginx with the following config:
log_format timed_combined '$remote_addr - $remote_user [$time_local] '
'$ssl_protocol/$ssl_cipher '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'$request_time $upstream_response_time $pipe';
Here is an example log entry:
66.249.76.120 - - [20/Dec/2016:19:04:03 +0100]
TLSv1.2/ECDHE-RSA-AES128-GCM-SHA256 "GET / HTTP/1.1" 200 27232
"-" "Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)" 0.026 0.026 .
How do I have to configure the .goaccessrc to read that format?

You can add this to your config file or ~/.goaccessrc
log-format %h %^[%d:%t %^] %^"%r" %s %b "%R" "%u" %T %^
date-format %d/%b/%Y
time-format %H:%M:%S

Related

Nginx cache lock always causes 500ms delay

Nginx cache locked requests(http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock) always takes 500ms to respond
$ab -n 2 -c 2 http://192.168.12.103/test1234
access.log:
192.168.12.103 - - [12/Sep/2017:02:34:59 -0700] "GET /test1234 HTTP/1.0" 200 12 "-" "ApacheBench/2.3" "-" 127.0.0.1:9095 0.002 0.002 MISS
192.168.12.103 - - [12/Sep/2017:02:34:59 -0700] "GET /test1234 HTTP/1.0" 200 12 "-" "ApacheBench/2.3" "-" - - 0.502 HIT
I know that it buffers to a temp file and copies it to the cache. But 500ms looks large. Anybody knows why?
Any help would be appreciated.
Setup info:
Test upstream server
cache is located in tmpfs (mount -t tmpfs -o size=512m tmpfs /tmpfs)
nginx version: 1.7.2.1
nginx config
worker_processes 1;
error_log logs/error.log;
daemon off;
pid /var/run/nginx.pid;
events {}
http {
proxy_cache_path /tmpfs/local_cache keys_zone=local_cache:250m levels=1:2 inactive=8s max_size=1G;
proxy_temp_path /tmpfs;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'$upstream_addr $upstream_response_time $request_time $upstream_cache_status';
access_log logs/access.log main;
server {
listen 80;
server_name localhost;
location / {
proxy_pass http://127.0.0.1:9095;
proxy_cache local_cache;
proxy_cache_valid 200 2s;
proxy_cache_lock on;
}
}
}
This seems to be the default behaviour. The cache locked requests are locked by worst case 500ms or the time_out value.(source)
If the upstream response times are stable, for cache_lock_timeout any value little above the response time, it will avoid the behaviour. In this particular case we can set about 5ms and the first request will return in 0-1 ms and the locked requests will return within 5-6ms.

Apache piped logs, errors and succeeds at the same time

I have this log format for my Apache server
LogFormat "%h %l %u %t \"%r\" status:%>s %O \"%{Referer}i\" \"%{User-Agent}i\" %v %h %D %A %>s %T" combined
I am trying to filter out these logs based on the value of %T more than 3, so that I can send them to Loggly. I have a huge volume of logs to manage, and cannot send all of them directly to Loggly as it is an overkill for the plan I am on.
I ended up writing this CustomLog directive in my VirtualHost
CustomLog "|/bin/bash -c '{ read ENTRY <&0; if [ ${ENTRY##* } -gt 3 ]; then echo $ENTRY | sed -r \'s/status:(\d*)//\' >> /var/log/apache2/slow.log; fi; }'" combined
This appends the correct log entries to the slow.log file but also gives me an error message
AH00106: piped log program '/bin/bash -c '{ read ENTRY <&0; if [ ${ENTRY##* } -gt 3 ]; then echo $ENTRY | sed -r \\'s/status:(\\d*)//\\' >> /var/log/apache2/slow.log; fi; }'' failed unexpectedly
This is the output in slow.log if it helps
XXX.YYY.ZZZ.WWW - - [06/Oct/2016:10:06:08 +0000] "GET / HTTP/1.1" 200 4575 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0" YYY.ZZZ.WWW.XXX ZZZ.WWW.XXX.YYY 4032879 ZZZ.WWW.XXX.YYY 200 4
Note I had to use status:%>s to filter out for other logs. I also tried without sed but the error persists.

GoAccess custom forwarded log parsing

I am currently using goaccess-1.0.2. I have installed it on an Amazon Linux box. The box which it resides has customized logs that were forwarded from an Apache WebApp Server. What I have tried to accomplish but can't seem to figure out is how to get GoAccess to parse our customized log.
Here is an example of the custom forwarded WebApp Log entry:
Jun 24 00:00:41 directory1 httpd-access: 55.117.170.95 www.URLaddress.com - [24/Jun/2016:00:00:41 -0700] "GET /sites/all/themes/somthing_on_demand/js/fancybox/jquery.fancybox-1.3.4.css HTTP/1.1" 304 - "ht
tps://www.IPaddress.com/my_account/yum" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" "SESSb9948a0b21e4d377a7d82f6adbf86c91=l
on7pgjlikml7q4tq954ejiao1; cookie_js=1; __utma=23285183.1119616966.1452095139.1468883973.1468963151.39; __utmb=23285183.500.10.1468963151; __utmc=23285183; __utmz=23285183.1468963151.39.39.utmcsr=fyi.URLaddress.com|utm
ccn=(r/INFOSEC-MAXLEN-256" "-" 57630
Here are a few log-formats I have tried:
log-format %^ %^ %^ "%h %^ %u %t \"%r\" %>s %b \"%R\" \"%u\" \"%^\" \"%^\" %D"
log-format "%h %{Host}i %{SSL_CLIENT_S_DN_CN}x %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{SHORT_COOKIE}e\" \"%{X-Forwarded-For}i\" %D"
log-format "%h %{Host}i %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{SHORT_COOKIE}e\" \"%{X-Forwarded-For}i\" %D"
I thought I would ignore the date and time format using %^ then use date format %m %d and time format %T .
I am very new at this and could really use help. Thank you for your feedback in advance.
Please try this, it works for me:
goaccess -f access.log --log-format='%^:%^:%^: %h %v %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%^" %D' --date-format='%d/%b/%Y' --time-format='%T'

Read in and parse textfile

I need to parse the file (txt) and display 10 lines of queries by the number of bytes. (sort) I have a file log.txt:
164.94.76.83.cust.bluewin.ch - - [17/Oct/2006:07:56:45 -0700] "GET /example/serif.css HTTP/1.1" 200 4824 "http://www.example.org/example/When/200x/2003/07/25/NotGaming" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
164.94.76.83.cust.bluewin.ch - - [03/Oct/2006:07:56:45 -0700] "GET /example/example.js HTTP/1.1" 200 6685 "http://www.example.org/example/When/200x/2003/07/25/NotGaming" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
164.94.76.83.cust.bluewin.ch - - [06/Oct/2006:07:56:46 -0700] "GET /example/When/200x/2003/07/25/Nuke.png HTTP/1.1" 200 19757 "http://www.example.org/example/When/200x/2003/07/25/NotGaming" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
164.94.76.83.cust.bluewin.ch - - [15/Oct/2006:07:56:46 -0700] "GET /example/When/200x/2003/07/25/diablo.png HTTP/1.1" 200 12597 "http://www.example.org/example/When/200x/2003/07/25/NotGaming" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
164.94.76.83.cust.bluewin.ch - - [19/Oct/2006:07:56:46 -0700] "GET /example/When/200x/2003/07/25/-big/Nuke.jpg HTTP/1.1" 403 322 "
Output must be (with count in % and links - and sort DESC):
1. http://www.example.org/example/When/200x/2006/09/25/ - 3100 - 74%
2. http://www.example.org/example/ - 1000 - 24%
3. http://www.example.org/example/genx/docs/Guide.html - 91 - 2%
That is, it is necessary to highlight the line for the maximum number of bytes in the request sort and indicate the amount of interest.
Since you insist on a shell-only approach, the closest solution to what you ask seems to be something like this:
sort -t ' ' -k 10 -r -n log.txt | head -n 10 | awk '{print $1 $7, $10}'
You can probably do (much) better by either setting a more useful LogFormat when logging the requests, or by allowing a Perl or Python parser when processing the logs.

Timing an http server response using bash simple tools

Is that written on the title
I need it to measure a server load time, and in case this value is higher than a threshold, I restart the web server automatically.
How to time an http server response using simple GNU bash?
You could script and action the output of ab or apache benchmark. Also ensure you have %D enabled as a logformat. So rather than scripting a test you could script to tail log files and if time taken above threshold to restart.
Here is a script:
#!/bin/bash
# alert threshold - amount of times to go over limit before capturing it as an issue;
ALERT_THRESHOLD=3
# alert time in seconds
# so this is the time it takes to load the page anything exceeding set seconds
ALERT_LIMIT=60;
ALERT_LIMIT_MILI=$(echo $ALERT_LIMIT|awk '{$3=$1*1000; print $3}')
TAIL_LIMIT=10;
LOG_FILE="/var/log/apache2/access.log"
RESULT=$(tail -n $TAIL_LIMIT $LOG_FILE|awk -v alimit=$ALERT_LIMIT_MILI -v athreshold=$ALERT_THRESHOLD 'BEGIN{QUERY=""; i=0; SENDALERT=0} {
if ($1 > alimit) {
i++; QUERY=QUERY" TIME_TAKEN:"($1/1000)"seconds,"$1"ms|DATE:"$5"|STATUS:"$10"|URL:"$12"\n";
if (i >= athreshold){
SENDALERT++;
};
}
} END { print "QUERY:"QUERY"\nSENDALERT:"SENDALERT; }')
SENDALERT=$(echo -e $RESULT|awk -F"SENDALERT:" '{print $2}')
echo $SENDALERT
if [[ $SENDALERT >=1 ]]; then
echo "restaring apache"
content=$(echo -e $RESULT|awk -F"QUERY:" '{print $2}')
(for lines in $(echo $content); do echo $lines; done;)
#(for lines in $(echo $content); do echo $lines; done;)| mail -s "REstarting apache $(date) " root#localhost
fi
The alert time in seconds was set to 0 for my tests, you will see alert level 8,there are 10 lines that have these time values, so once variable i hits limit which is 3, it starts to inrement sendalert variable, this is why it reports it as 8, since the first two were passed as part of threshold.
running it:
./script.sh
ALERT LEVEL: 8
restaring apache
TIME_TAKEN:0.108seconds,108ms|DATE:[07/Mar/2013:22:12:51|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:0.299seconds,299ms|DATE:[07/Mar/2013:22:12:51|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:3.432seconds,3432ms|DATE:[07/Mar/2013:22:12:58|STATUS:200|URL:"-"
TIME_TAKEN:0.217seconds,217ms|DATE:[07/Mar/2013:22:12:58|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:0.117seconds,117ms|DATE:[07/Mar/2013:22:12:58|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:0.101seconds,101ms|DATE:[07/Mar/2013:22:12:58|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:3.255seconds,3255ms|DATE:[07/Mar/2013:22:13:03|STATUS:200|URL:"-"
TIME_TAKEN:0.351seconds,351ms|DATE:[07/Mar/2013:22:13:03|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:0.242seconds,242ms|DATE:[07/Mar/2013:22:13:03|STATUS:304|URL:"http://localhost/"
TIME_TAKEN:0.112seconds,112ms|DATE:[07/Mar/2013:22:13:03|STATUS:304|URL:"http://localhost/"
SENDALERT:8
---- apache access log:
108 127.0.0.1 - - [07/Mar/2013:22:12:51 +0000] "GET /icons/folder.gif HTTP/1.1" 304 186 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
299 127.0.0.1 - - [07/Mar/2013:22:12:51 +0000] "GET /icons/compressed.gif HTTP/1.1" 304 188 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
3432 127.0.0.1 - - [07/Mar/2013:22:12:58 +0000] "GET / HTTP/1.1" 200 783 "-" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
217 127.0.0.1 - - [07/Mar/2013:22:12:58 +0000] "GET /icons/blank.gif HTTP/1.1" 304 186 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
117 127.0.0.1 - - [07/Mar/2013:22:12:58 +0000] "GET /icons/folder.gif HTTP/1.1" 304 186 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
101 127.0.0.1 - - [07/Mar/2013:22:12:58 +0000] "GET /icons/compressed.gif HTTP/1.1" 304 187 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
3255 127.0.0.1 - - [07/Mar/2013:22:13:03 +0000] "GET / HTTP/1.1" 200 782 "-" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
351 127.0.0.1 - - [07/Mar/2013:22:13:03 +0000] "GET /icons/folder.gif HTTP/1.1" 304 187 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
242 127.0.0.1 - - [07/Mar/2013:22:13:03 +0000] "GET /icons/compressed.gif HTTP/1.1" 304 188 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
112 127.0.0.1 - - [07/Mar/2013:22:13:03 +0000] "GET /icons/blank.gif HTTP/1.1" 304 186 "http://localhost/" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0"
where I have put %D as first column output and in the awk statement in the script I am comparing $1's value against limit.. The rest of the $10's etc are according to where things appear in my log..
You could then put it in some script folder, remove verbosity or pump outut to dev null and run it as part of cron every 10 minutes or something
enjoy
This is a one-liner that solves the issue
(time wget -p --no-cache --delete-after www.example.com -q) 2>&1 >/dev/null | grep real | awk -F"[m\t]" '{ printf "%s\n", $2*60+$3 }'
it returns the load time of a page load in seconds and fraction of a second using a dot separator
time(5)
wget
awk

Resources