Kibana count unique records based on log value - elasticsearch

I want to create a Kibana metric for the unique users visiting my site.
I have an index collecting logs from a service in the format
<date> <user1@gmail.com> - <log message> <client>
and I want to count unique user emails ignoring the rest of the fields.
Is it possible to apply such a regex via one of the aggregations? So far I have only found Unique Count on a specific existing field, which is not an option for me.

You can create a separate field first:
Either by using Kibana scripted fields,
or by using the Logstash mutate filter plugin (a sketch of this route follows below).
Then you can apply a terms aggregation on a data table visualization to achieve this.
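A minimal Logstash sketch, assuming the raw line lands in a field called message and that the date placeholder contains no spaces (the field name user_email and both regexes are assumptions to adapt to your actual format):

filter {
  # Copy the raw line into a new field, then strip it down to just the email.
  mutate { copy => { "message" => "user_email" } }
  mutate {
    gsub => [
      "user_email", "^\S+\s+", "",     # drop the leading date token
      "user_email", "\s+-\s+.*$", ""   # drop everything from the " - " separator onwards
    ]
  }
}

With user_email in place, a Unique Count (cardinality) metric or a terms aggregation on that field gives the per-user numbers.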

Related

Grafana Variables Query using ElasticSearch - Filter doesn't work

I am currently using Grafana v9.1.7 and ElasticSearch 8.4.2
What I'm trying to achieve is to create a dashboard that can filter the data by country. I have a keyword field named honeypot_country (it's a string that is mapped into a keyword in Elastic). When this filter is selected, it should only provide the set of data filtered by that country.
I already tried to create a variable query to filter these data, but it doesn't filter as I want it to. So I hope anyone can help me with this issue. Thanks

Use @timestamp as metric in an elastic dashboard

Problem
I am trying to build a dashboard in elastic with a table to monitor job runs.
I want to have, per run, the minimum timestamp (i.e. job start) and the number of processed messages. The minimum timestamp is my problem; I can't seem to get it.
What I have done
All my log lines have as (relevant) fields: @timestamp, nb_messages, run_id. run_id is unique per run, and a run creates multiple log lines.
I create a dashboard, add a TSVB panel, and select Table.
I use run_id as the field to group by.
I can use max(nb_messages) in my table without issue.
But if I use min(@timestamp), or any aggregation other than count, I just get a -.
I first tried with a Lens instead of a TSVB panel and had the same issue, but with the message: To use this function, select a different field.
I can confirm in the index that logging.timestamp has date as its type.
Question
Is there a way to use the timestamp as metric?
I would use a "normal" data table visualization (navigate through the Aggregation based option in the Visualization menu if you're using the latest version of Kibana) instead of the TSVB. There, the default metric is Count, representing the number of events of the index pattern in the selected time range. You can use the Min metric on the @timestamp field and aggregate/group your data as you want.
The prerequisite is, of course, that the selected index pattern contains an @timestamp field.
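For reference, the query such a table runs boils down to an aggregation like this (a sketch in Kibana Dev Tools syntax, which accepts // comments; the index pattern logs-* is a placeholder, and run_id may need a .keyword suffix depending on your mapping):

POST logs-*/_search
{
  "size": 0,
  "aggs": {
    "per_run": {
      "terms": { "field": "run_id" },  // one bucket per run
      "aggs": {
        "job_start": { "min": { "field": "@timestamp" } },  // job start = minimum timestamp
        "messages":  { "max": { "field": "nb_messages" } }
      }
    }
  }
}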
I hope I could help you.

elasticsearch - Tag data with lookup table values

I’m trying to tag my data according to a lookup table.
The lookup table has these fields:
• Key: represents the field name in the data I want to tag.
In the real data the field is a subfield of the “Headers” field.
An example for the “Key” field:
“Server.*” (* is a wildcard)
• Value: represents the wanted value of the field mentioned above.
The value in the lookup table is only part of a string in the real data value.
An example for the “Value” field:
“Avtech”
• Vendor: the value I want to add to the real data if a combination of field and value is found in a document.
An example for combination in the real data:
“Headers.Server : Linux/2.x UPnP/1.0 Avtech/1.0”
A match for that document in the lookup table will be:
Key = Server (with wildcards on both sides)
Value = Avtech (with wildcards on both sides)
Vendor = Avtech
So basically I'll need to add a field to that document with the value “Avtech”.
The subfields in “Headers” are dynamic fields that change from document to document.
If a match is not found, I'll need to set the tag field to the value “Unknown”.
I've tried to use the enrich processor, using the lookup table as the source data; the match field would be “Value” and the enrich field “Vendor”.
In the enrich processor I didn't know how to refer to the field, since it's dynamic, and I wanted to search for the value anywhere in the “Headers” subfields.
Also, I don't think there will be an exact match between the “Value” in the lookup table and the value of the “Headers” subfield, since the “Value” field in the lookup table is a substring with wildcards on both sides.
I could use some help to accomplish what I'm trying to do, and with how to search with wildcards inside an enrich processor.
Or, if you have another idea besides the enrich processor, such as a parent-child or terms-lookup mechanism, that would help too.
Thanks!
Adi.
There are two ways to accomplish this:
Using a combination of Logstash & Elasticsearch
Using only the Elasticsearch ingest node
Constraint: you need to know the position at which the Vendor term occurs in the Header field.
Approach 1
If you do, you can use the grok filter to extract the term and then, based on the term found, do a lookup to get the corresponding value.
Reference
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-jdbc_static.html
https://www.elastic.co/guide/en/logstash/current/plugins-filters-jdbc_streaming.html
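A minimal sketch of this approach, using the translate filter as a simple dictionary lookup in place of the jdbc_* filters linked above (the nested field name, the assumption that the vendor token is the last product/version pair, and the dictionary path are all hypothetical):

filter {
  # Extract the token of the last "product/version" pair as the candidate vendor term,
  # e.g. "Avtech" from "Linux/2.x UPnP/1.0 Avtech/1.0".
  grok {
    match => { "[Headers][Server]" => "%{WORD:vendor_term}/%{NOTSPACE}$" }
    tag_on_failure => []
  }
  # Look the term up in a YAML dictionary of Value => Vendor pairs.
  translate {
    source          => "vendor_term"
    target          => "Vendor"
    dictionary_path => "/etc/logstash/vendors.yml"  # hypothetical path
    fallback        => "Unknown"
  }
}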
Approach 2
Create an index consisting of KV pairs. In the ingest node, create a pipeline which consists of a grok processor followed by an enrich processor. The grok would work the same way as in Approach 1, and you seem to have already got the enrich part working.
Reference
https://www.elastic.co/guide/en/elasticsearch/reference/current/grok-processor.html
If you are able to isolate the subfield within the Header where the term of interest is present, that would make things easier for you.
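A rough sketch of such a pipeline (Dev Tools syntax; the pipeline and policy names are made up, the interesting subfield is assumed to be Headers.Server with the vendor as the last product/version pair, and a match-type enrich policy on the lookup index's Value field is assumed to already be created and executed):

PUT _ingest/pipeline/tag-vendor
{
  "processors": [
    {
      "grok": {
        "field": "Headers.Server",
        "patterns": ["%{WORD:vendor_term}/%{NOTSPACE}$"],
        "ignore_failure": true
      }
    },
    {
      "enrich": {
        "policy_name": "vendor-policy",
        "field": "vendor_term",
        "target_field": "vendor_info",
        "ignore_missing": true
      }
    },
    {
      "set": {
        "if": "ctx.vendor_info != null",
        "field": "Vendor",
        "value": "{{vendor_info.Vendor}}"
      }
    },
    {
      "set": {
        "if": "ctx.vendor_info == null",
        "field": "Vendor",
        "value": "Unknown"
      }
    }
  ]
}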

How to map between logstash extracted fields

I am extracting info from logfiles, and I want to map the extracted fields together for aggregations. Here's a sample logfile:
2017-01-01 07:53:44 [monitor_utils.py] INFO: Crawled iteration for merchant ariika started
2017-01-01 07:53:44 [utils.py] INFO: UpdateCrawlIteration._start_crawl_iteration function took 0.127 s
2017-01-01 07:57:22 [statscollectors.py] INFO: Dumping Scrapy stats:
{'item_scraped_count': 22,
'invalid_items_count': 84}
I am extracting the merchant name (ariika) from the first line and item_scraped_count, invalid_items_count from the last two lines. I have different logfiles for each merchant, and I want to know the items scraped count per logfile for each merchant using Kibana.
How to filter between one merchant and another in my case?
If I understand you correctly, I believe the source field in Kibana can help you.
The source field indicates the name of the log file.
Using the bar chart, you can choose the metric you want based on your items scraped count field, then create a bucket aggregation using the source field and filter on the merchant field in the Kibana search bar.
Or, on top of the first aggregation, you can create another aggregation using the merchant field and choose split bars, without using the Kibana search bar.
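The equivalent aggregation-based query would look roughly like this (a sketch in Kibana Dev Tools syntax; the index pattern and the exact field names, in particular the .keyword suffixes, are assumptions to adjust to your mapping):

POST scrapy-logs-*/_search
{
  "size": 0,
  "aggs": {
    "per_logfile": {
      "terms": { "field": "source.keyword" },        // one bucket per log file
      "aggs": {
        "per_merchant": {
          "terms": { "field": "merchant.keyword" },  // split by merchant
          "aggs": {
            "items_scraped": { "max": { "field": "item_scraped_count" } }
          }
        }
      }
    }
  }
}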

kibana unique count seeing names with - as different entries

I have a problem with the unique count feature.
I get data from Elasticsearch, for example a computer name (PC-01) in a field.
When I want to use a unique count in a visualisation, Kibana turns "DESKTOP-2D562R2" into "DESKTOP" and "2D562R2" as separate entries.
The problem with this is that "2d562r2" and "desktop" show up as two different entries in a Kibana table or in the unique count.
Your field is being analyzed (split into tokens). Change the mapping (or template, depending on how you're creating the indexes) to make this field not_analyzed.
Note that, as a hack, Logstash's default template creates a ".raw" version of string fields that is not analyzed. You could refer to the .raw version of your field instead.
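For example (a sketch with placeholder index, type, and field names; on Elasticsearch 5.x and later you would use "type": "keyword" instead of the legacy string/not_analyzed pair):

PUT my-index
{
  "mappings": {
    "logs": {
      "properties": {
        "computer_name": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}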