Log query to get a value from a log message and get the average - grafana-loki

I have some log messages in my Loki:
2023-02-13 12:20:08.675 INFO 30937 --- [lettuce-epollEventLoop-5-1] c.g.poc.Filter.AuthenticationFilter : [ requestId : 904c1292-66, AuthFilterTime : 15 ms ]
2023-02-13 12:16:32.100 INFO 30937 --- [lettuce-epollEventLoop-5-1] c.g.poc.Filter.AuthenticationFilter : [ requestId : f84a572f-65, AuthFilterTime : 4 ms ]
2023-02-13 12:16:31.427 INFO 30937 --- [lettuce-epollEventLoop-5-1] c.g.poc.Filter.AuthenticationFilter : [ requestId : 904c1292-64, AuthFilterTime : 10 ms ]
I want to get the average value of AuthFilterTime.
I'm getting the error:
"parse error at line 5, col 6: syntax error: unexpected NUMBER"
when I run this query:
sum by (filename)(
  avg_over_time(
    {filename="/path/to/the/log/file"} |= "AuthFilterTime.*ms" |
    regexp `AuthFilterTime\s*:\s*(\d+) ms` |
    $1
  )[24h]
)
Can somebody help me figure out what I'm doing wrong?
I'm new to Grafana.

Try this:
avg_over_time(
  {filename="/path/to/the/log/file"}
    |= "AuthFilterTime"
    | regexp `AuthFilterTime\s*:\s*(?P<AuthFilterTime>\d+) ms`
    | unwrap AuthFilterTime
  [24h]
) by (filename)
This query uses a regular expression with a named capture group to extract the AuthFilterTime value from each log message, unwraps it into a numeric sample, and then calculates the average over a 24-hour window, grouped per log file.
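For reference, the original query fails because $1 is not a valid LogQL pipeline stage: values are extracted via named capture groups and converted with unwrap, and the [24h] range selector belongs inside the metric function, directly after the unwrap expression, rather than outside the closing parenthesis.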

Related

How to display percentile stats per URL on the console

I am working on writing some performance tests using Taurus & JMeter.
After executing a set of tests on some URLs, I see the stats on the console as below.
19:03:40 INFO: Percentiles:
+---------------+---------------+
| Percentile, % | Resp. Time, s |
+---------------+---------------+
| 95.0 | 2.731 |
+---------------+---------------+
19:03:40 INFO: Request label stats:
+--------------+--------+---------+--------+-------+
| label | status | succ | avg_rt | error |
+--------------+--------+---------+--------+-------+
| /v1/brands | OK | 100.00% | 2.730 | |
| /v1/catalogs | OK | 100.00% | 1.522 | |
+--------------+--------+---------+--------+-------+
I'm wondering if there is a way to display other stats per URL, for example the percentile response time per URL.
Below are all the stats that can be captured from Taurus (according to the Taurus documentation), but I couldn't figure out the configuration required to display them on the console. I'd appreciate any help.
label - is the sample group for which this CSV line presents the stats. Empty label means total of all labels
concurrency - average number of Virtual Users
throughput - total count of all samples
succ - total count of not-failed samples
fail - total count of failed samples
avg_rt - average response time
stdev_rt - standard deviation of response time
avg_ct - average connect time if present
avg_lt - average latency if present
rc_200 - counts for specific response codes
perc_0.0 .. perc_100.0 - percentile levels for response time, 0 is also minimum response time, 100 is maximum
bytes - total download size
Looking into the documentation on the Taurus Console Reporter, it is possible to amend only the following parameters:
modules:
  console:
    # disable console reporter
    disable: false  # default: auto

    # configure screen type
    screen: console
    # valid values are:
    # - console (ncurses-based dashboard, default for *nix systems)
    # - gui (window-based dashboard, default for Windows, requires Tkinter)
    # - dummy (text output into console for non-tty cases)
    dummy-cols: 140  # width for dummy screen
    dummy-rows: 35   # height for dummy screen
If you can understand and write Python code, you can try amending the reporting.py file, which is responsible for generating the stats and summary table. This is a good point to start:
def __report_summary_labels(self, cumulative):
    data = [("label", "status", "succ", "avg_rt", "error")]
    justify = {0: "left", 1: "center", 2: "right", 3: "right", 4: "left"}
    sorted_labels = sorted(cumulative.keys())
    for sample_label in sorted_labels:
        if sample_label != "":
            data.append(self.__get_sample_element(cumulative[sample_label], sample_label))
    table = SingleTable(data) if sys.stdout.isatty() else AsciiTable(data)
    table.justify_columns = justify
    self.log.info("Request label stats:\n%s", table.table)
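As a rough sketch (not stock Taurus code), the method could be extended with an extra percentile column along these lines. It assumes the surrounding module already imports sys, SingleTable and AsciiTable as above, and that each cumulative[label] behaves like a dict that stores percentiles under a "perc" key keyed by level strings such as "95.0"; check this against your Taurus version before relying on it:
# Hypothetical modification: adds a p95_rt column to the summary table.
def __report_summary_labels(self, cumulative):
    data = [("label", "status", "succ", "avg_rt", "p95_rt", "error")]
    justify = {0: "left", 1: "center", 2: "right", 3: "right", 4: "right", 5: "left"}
    for sample_label in sorted(cumulative.keys()):
        if sample_label != "":
            # reuse the existing row builder, then splice in the 95th percentile
            row = list(self.__get_sample_element(cumulative[sample_label], sample_label))
            p95 = cumulative[sample_label].get("perc", {}).get("95.0")  # assumed key layout
            row.insert(4, "%.3f" % p95 if p95 is not None else "")
            data.append(tuple(row))
    table = SingleTable(data) if sys.stdout.isatty() else AsciiTable(data)
    table.justify_columns = justify
    self.log.info("Request label stats:\n%s", table.table)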
Alternatively, you can use Online Interactive Reports or configure your JMeter test to use Grafana and InfluxDB as third-party metrics storage and visualisation systems.

Maximum value of a column in apache pig

I am trying to find the maximum value of the column ratingTime using Pig. I am running the below script:
ratings = LOAD '/user/maria_dev/ml-100k/u.data' AS (userid:int,movieID:int,rating:int, ratingTime:int);
maxrating = MAX(ratings.ratingTime);
DUMP maxrating
Sample input data is:
196 242 3 881250949
186 302 3 891717742
22 377 1 878887116
244 51 2 880606923
I am getting the below error:
2018-08-05 07:02:05,247 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2018-08-05 07:02:05,914 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <file script.pi
You need a preceding GROUP ALL before applying MAX. Source
ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t') AS (userid:int, movieID:int, rating:int, ratingTime:int);
ratings_group = GROUP ratings ALL;
maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime);
DUMP maxrating;
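With the sample input above, this prints (891717742).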

Logstash GROK filter for tomcat logs

I am new to ELK. I need a pattern that can find the value following "transactionId" in Tomcat logs and put it into a separate field.
Sample logs are like below:
2018-03-14 10:58:36,853 INFO so:165 - Female Value : 0.084370888769626617 for transactionId ABCsdf62969
2018-03-14 10:58:36,853 INFO so:165 - White Value : 0.90355902910232544 for transactionId ABtgF62969
2018-03-14 10:58:36,853 INFO so:165 - Black Value : 0.001742142834700644 for transactionId ZBCBfg2969
2018-03-14 10:58:36,853 INFO so:165 - Asian Value : 0.0055485325865447521 for transactionId TBCBF62969
2018-03-14 10:58:36,853 INFO so:165 - Hispanic Value : 0.079676181077957153 for transactionId L45BF62969
2018-03-14 10:58:36,853 INFO so:165 - Other Value : 0.0094741648063063622 for transactionId A56BF62969
This is not a question/answer website where you just ask a question without even trying. At least try first and then come back with the errors that you get.
I think you are new to this platform. I'll give you the pattern this time, but try to understand it first, and then work on your own for future patterns.
For your case, this will work:
grok {
  match => ["message", "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:log_level} %{GREEDYDATA:some_data} transactionId %{WORD:transaction_id}"]
}
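For the first sample line, this would put ABCsdf62969 into transaction_id, 2018-03-14 10:58:36,853 into time, and INFO into log_level; everything in between ends up in some_data.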
Try and work hard. Thanks !!

Return values of specific fields of a file (bash)

I have a very long list with hits from a HMMer search in the following form:
Query: Alvin_0001|ID:9263667| [L=454]
Description: chromosomal replication initiator protein DnaA [Allochromatium vinosum DSM 180]
Scores for complete sequence (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Model Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
7.5e-150 497.8 0.2 9e-150 497.5 0.2 1.0 1 COG0593
8e-11 40.6 0.5 1.5e-10 39.7 0.5 1.6 1 COG1484
4.5e-07 28.1 0.2 6e-07 27.7 0.2 1.1 1 COG1373
2.5e-05 22.3 0.1 3.4e-05 21.8 0.1 1.4 1 COG1485
Query: Alvin_0005|ID:9265207| [L=334]
Description: hypothetical protein [Allochromatium vinosum DSM 180]
Scores for complete sequence (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Model Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
------ inclusion threshold ------
0.018 13.4 12.9 0.068 11.5 3.6 2.2 2 COG3247
0.024 13.1 9.0 0.053 12.0 9.0 1.5 1 COG2246
0.046 12.4 7.3 0.049 12.4 5.3 1.8 1 COG2020
Query: Alvin_0004|ID:9265206| [L=154]
Description: hypothetical protein [Allochromatium vinosum DSM 180]
Scores for complete sequence (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Model Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
[No hits detected that satisfy reporting thresholds]
This file contains a lot of information that I am not interested in, so I need a script that outputs only certain values: the line with Query: and the first COG#### in the Model column.
So, as the expected output (a tab-delimited file would be best):
Query: Alvin_0001|ID:9263667| [L=454] COG0593
Query: Alvin_0005|ID:9265207| [L=334] COG3247
Query: Alvin_0004|ID:9265206| [L=154]
Note that in the last line, no COG was found.
Now the file structure is a bit too complicated for me to use a simple grep or awk command:
In the first block, the 1st and the 6th line would be the target (awk '/Query: /{nr[NR]; nr[NR+6]}; NR in nr')
In the second block, it is the 1st and the 7th line
and in the third, there is only the line with Query
So what would now be a good approach to parse this file?
Short awk solution:
awk '/^Query:/{ if(q) print q; q=$0 }q && $9~/^COG.{4}$/{ printf("%s\t%s\n",q,$9); q="" }
END{ if(q) print q }' file
The output:
Query: Alvin_0001|ID:9263667| [L=454] COG0593
Query: Alvin_0005|ID:9265207| [L=334] COG3247
Query: Alvin_0004|ID:9265206| [L=154]
Details:
/^Query:/{ if(q) print q; q=$0 } - prints the previous "Query" line if it produced no hit, then captures the new one
q && $9~/^COG.{4}$/ - captures the first "Model" field value for the current query (resetting q="" ensures only the first hit is taken)
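A separate answer takes another approach, keyed on hit lines whose first field is numeric: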
$ cat tst.awk
BEGIN { OFS="\t" }
/^Query/ { qry=$0 }
$1 ~ /^[0-9]/ { if (qry!="") print qry, $9; qry="" }
/\[No hits/ { print qry }
$ awk -f tst.awk file
Query: Alvin_0001|ID:9263667| [L=454] COG0593
Query: Alvin_0005|ID:9265207| [L=334] COG3247
Query: Alvin_0004|ID:9265206| [L=154]

How to format digits in logstash?

How can I format digits in Logstash?
I am using the '%04d' % format expression in Ruby code in the filter plugin, but I get nil as the format result. I tried sprintf and format functions, but got the same result.
Below is my code snippet.
ruby {
  code => "
    event.set( 'positioning', event.get('branch_lat') + ',' + event.get('branch_lon') )
    event.set( 'report_datetime', event.get('report_date') + '%04d' % event.get('report_time') )
  "
}
As a result, I get the below error in the log.
[2016-10-28T12:31:43,217][ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `+' for nil:NilClass
My platform information is below.
[root@elk-analytic logstash]# rpm -qi logstash
Name : logstash
Epoch : 1
Version : 5.0.0
Release : 1
Architecture: noarch
Install Date: Thu 27 Oct 2016 01:26:03 PM JST
Group : default
Size : 198320729
License : ASL 2.0
Signature : RSA/SHA512, Wed 26 Oct 2016 01:57:59 PM JST, Key ID d27d666cd88e42b4
Source RPM : logstash-5.0.0-1.src.rpm
Build Date : Wed 26 Oct 2016 01:10:26 PM JST
Build Host : packer-virtualbox-iso-1474648640
Relocations : /
Packager : <vagrant@packer-virtualbox-iso-1474648640>
Vendor : Elasticsearch
URL : http://www.elasticsearch.org/overview/logstash/
Summary : An extensible logging pipeline
Description :
An extensible logging pipeline
Added on 2016.10.28 14:32
My goal is to parse the below CSV columns into a timestamp field in Elasticsearch.
Please note that the hour of the time comes in a mix of 1-digit and 2-digit patterns.
date,time
20160204,1000
20160204,935
I tried using the date filter plugin, but it did not work properly and logged an error.
[2016-10-28T11:00:10,233][WARN ][logstash.filters.date ] Failed parsing date from field {:field=>"report_datetime",
:value=>"20160204 935", :exception=>"Cannot parse \"20160204 935\": Value 93 for hourOfDay must be in the range [0,23]", :config_parsers=>"YYYYMMdd Hmm", :config_locale=>"default=en_US"}
Below is the code snippet that produced the above error.
ruby {
  code => "
    event.set( 'positioning', event.get('branch_lat') + ',' + event.get('branch_lon') )
    event.set( 'report_datetime', event.get('report_date') + ' ' + event.get('report_time') )
  "
}
# Set the @timestamp according to report_date and time
date {
  "match" => ["report_datetime", "YYYYMMdd Hmm"]
}
I did some modification and ended up with the code I first posted.
I suggest doing it like this, without any ruby filter:
filter {
  # your other filters...

  # if the time has only 3 digits (i.e. a 1-digit hour), pad it with a leading zero
  if [time] =~ /^\d{3}$/ {
    mutate {
      add_field => { "report_datetime" => "%{date} 0%{time}" }
    }
  # otherwise just concatenate the fields
  } else {
    mutate {
      add_field => { "report_datetime" => "%{date} %{time}" }
    }
  }

  # match date and time
  date {
    "match" => ["report_datetime", "yyyyMMdd HHmm"]
    "target" => "report_datetime"
  }
}
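With the sample rows above, report_datetime becomes 20160204 1000 and 20160204 0935, both of which parse cleanly with the yyyyMMdd HHmm pattern.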
