Logstash - Add fields from the log - Grok - elasticsearch

I'm learning Logstash and I'm using Kibana to see the logs. I would like to know if there is any way to add fields using data from the message property.
For example, the log is like this:
@timestamp:December 21st 2016, 21:39:12.444 port:47,144
appid:%{[path]} host:172.18.0.5 levell:level message:
{"@timestamp":"2016-12-22T00:39:12.438+00:00","@version":1,"message":"Hello","logger_name":"com.empresa.miAlquiler.controllers.UserController","thread_name":"http-nio-7777-exec-1","level":"INFO","level_value":20000,
"HOSTNAME":"6f92ae402cb4","X-Span-Export":"false","X-B3-SpanId":"8f548829e9d18a8a","X-B3-TraceId":"8f548829e9d18a8a"}
My Logstash conf looks like this:
filter {
  grok {
    match => {
      "message" => "^%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:level}\s+%{NUMBER:pid}\s+---\s+\[\s*%{USERNAME:thread}\s*\]\s+%{JAVAFILE:class}\s*:\s*%{DATA:themessage}(?:\n+(?<stacktrace>(?:.|\r|\n)+))?$"
    }
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
  mutate {
    remove_field => ["@version"]
    add_field => {
      "appid" => "%{[path]}"
    }
    add_field => {
      "levell" => "level"
    }
  }
}
I would like to take level (which is INFO in the log) and message (which is Hello in the log) and add them as fields.
Is there any way to do that?

What if you do something like this using mutate:
filter {
  mutate {
    # this should concatenate your appid and levell into a new field
    add_field => ["newfield", "%{appid} %{levell}"]
  }
}
You might have a look at this thread.
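Since the embedded JSON in the message already carries level and message, another option is to parse it with the json filter instead of (or in addition to) grok. A minimal sketch, assuming the inner JSON sits in the message field; the target name parsed_json is only illustrative:
filter {
  json {
    source => "message"
    target => "parsed_json"
  }
  mutate {
    add_field => {
      "levell"     => "%{[parsed_json][level]}"
      "themessage" => "%{[parsed_json][message]}"
    }
  }
}
Without a target, the parsed keys (including level and message) land at the top level of the event, which may be all you need.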

Related

grok not parsing logs

Log Sample
[2020-01-09 04:45:56] VERBOSE[20735][C-0000ccf3] pbx.c: Executing [9081228577525#from-internal:9] Macro("PJSIP/3512-00010e39", "dialout-trunk,1,081228577525,,off") in new stack
I'm trying to parse some logs.
I have tested it on some logs I made and it returns the result I need, but when I combine it with my config and run it, the logs are not parsed into the index.
Here is my config:
input {
  beats {
    port => 5044
  }
}
filter {
  if [type] == "asterisk_debug" {
    if [message] =~ /^\[/ {
      grok {
        match => {
          "message" => "\[%{TIMESTAMP_ISO8601:log_timestamp}\] +(?<log_level>(?i)(?:debug|notice|warning|error|verbose|dtmf|fax|security)(?-i))\[%{INT:thread_id}\](?:\[%{DATA:call_thread_id}\])? %{DATA:module_name}\: %{GREEDYDATA:log_message}"
        }
        add_field => [ "received_timestamp", "%{@timestamp}" ]
        add_field => [ "process_name", "asterisk" ]
      }
      if ![log_message] {
        mutate {
          add_field => { "log_message" => "" }
        }
      }
      if [log_message] =~ /^Executing/ and [module_name] == "pbx.c" {
        grok {
          match => {
            "log_message" => "Executing +\[%{DATA:TARGET}#%{DATA:dialplan_context}:%{INT:dialplan_priority}\] +%{DATA:asterisk_app}\(\"%{DATA:protocol}/%{DATA:Ext}-%{DATA:Channel}\",+ \"%{DATA:procedure},%{INT:trunk},%{DATA:dest},,%{DATA:mode}\"\) %{GREEDYDATA:log_message}"
          }
        }
      }
    }
  }
}
output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "new_asterisk"
  }
}
When I check the index in Kibana, it just shows the raw logs.
Questions:
Why is my conf not parsing the logs, even though the grok pattern I made tested successfully (by me)?
Solved: the logs did not get into the if condition.
It seems like your grok actions don't get applied at all, because the data gets indexed raw and no error tags are added. Apparently your documents don't contain a field type with the value asterisk_debug, which is your condition for executing the grok actions.
To verify this, you could implement a simple else path that adds a field or tag indicating that the condition was not met, like so:
filter {
  if [type] == "asterisk_debug" {
    # your grok's ...
  }
  else {
    mutate {
      add_tag => [ "no_asterisk_debug_type" ]
    }
  }
}
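Another quick check is to temporarily print the events with the rubydebug codec (as used elsewhere in this thread) to see which fields Beats actually ships, e.g. whether type exists or the value sits somewhere else; a minimal sketch of the output section:
output {
  # temporary debug output; remove once the field names are confirmed
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "new_asterisk"
  }
}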

How to filter data with Logstash before storing parsed data in Elasticsearch

I understand that Logstash is for aggregating and processing logs. I have NGINX logs and had set up my Logstash config as follows:
filter {
  grok {
    match => [ "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" ]
    overwrite => [ "message" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  geoip {
    source => "clientip"
    target => "geoip"
    add_tag => [ "nginx-geoip" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weblogs-%{+YYYY.MM}"
    document_type => "nginx_logs"
  }
  stdout { codec => rubydebug }
}
This would parse the unstructured logs into a structured form of data, and store the data into monthly indexes.
What I discovered is that the majority of logs were contributed by robots/web-crawlers. In Python I would filter them out with:
browser_names = browser_names[~browser_names.str.\
match('^[\w\W]*(google|bot|spider|crawl|headless)[\w\W]*$', na=False)]
However, I would like to filter them out with Logstash so I can save a lot of disk space on the Elasticsearch server. Is there a way to do that? Thanks in advance!
Thanks LeBigCat for generously giving a hint. I solved this problem by adding the following under the filter:
if [browser_names] =~ /(?i)^[\w\W]*(google|bot|spider|crawl|headless)[\w\W]*$/ {
  drop {}
}
The (?i) flag is for case-insensitive matching.
In your filter you can use drop (https://www.elastic.co/guide/en/logstash/current/plugins-filters-drop.html). As you already have your pattern, it should be pretty fast ;)
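Note that browser_names in the question is a column from the Python side; in Logstash the field name comes from the useragent filter. A minimal sketch, assuming the filter is given an explicit target so the browser name ends up in [ua][name] (the exact field layout depends on the plugin version):
filter {
  useragent {
    source => "agent"
    target => "ua"
  }
  # drop events whose browser name looks like a bot or crawler
  if [ua][name] =~ /(?i)(google|bot|spider|crawl|headless)/ {
    drop {}
  }
}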

Kibana. Extract fields from @message containing a JSON

I would like to extract, in Kibana, fields from the @message field, which contains a JSON.
For example:
Audit{
uuid='xxx-xx-d3sd-fds3-f43',
action='/v1.0/execute/super/method',
resultCode='SUCCESS',
browser='null',
ipAddress='192.168.2.44',
application='application1',
timeTaken='167'
}
Having "action" and "application" fields I hope to be able to find top 5 requests that hits the application.
I started with something similar to this:
filter {
  if [message] =~ "Audit" {
    grok {
      match => {
        "message" => "%{WORD:uuid}, %{WORD:action}, %{WORD:resultCode}, %{WORD:browser}, %{WORD:ipAddress}, %{WORD:application}, %{NUMBER:timeTaken}"
      }
      add_field => ["action", "%{action}"]
      add_field => ["application", "%{application}"]
    }
  }
}
But it seems to be too far from reality.
If the content of "Audit" is really in JSON format, you can use the "json" filter plugin:
json {
  source => "Audit"
}
It will do the parsing for you and create everything. You don't need grok / add_field.
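As shown above, the sample payload is a set of key='value' pairs rather than strict JSON, so the json filter may reject it. In that case a kv filter could be an alternative; a minimal sketch, where audit_body is an illustrative field assumed to hold just the contents between the braces:
filter {
  kv {
    # audit_body is a hypothetical field containing uuid='...', action='...', ...
    source      => "audit_body"
    field_split => ","
    value_split => "="
    trim_key    => " "
    trim_value  => "' "
  }
}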

ELK LDAP log filtering

Two things: Our logs look like this -
May 11 06:51:31 ldap slapd[6694]: conn=1574001 op=1 SRCH base="cn=s_02,ou=users,o=meta" scope=0 deref=0 filter="(...)"
I need to 1) take the timestamp and set it as the left-hand "Time" column in Kibana's Discover panel, and 2) take the number after conn and make it a field so I can order the entries by that number. I've spent all day researching; date and mutate seem promising, but I haven't been able to get them implemented correctly.
The config file looks like this:
input {
  file {
    path => "/Desktop/logs/*.log"
    type => "log"
    sincedb_path => "/dev/null"
  }
}
output {
  elasticsearch {
    hosts => "127.0.0.1"
    index => "logstash-%{type}-%{+YYYY.MM.dd}"
  }
  file {
    path => "/home/logsOut/%{type}.%{+yyyy.MM.dd.HH.mm}"
  }
}
If you only need these two as separate fields:
filter {
  grok {
    match => {
      "message" => [ "%{SYSLOGBASE} conn=%{INT:conn}" ]
    }
  }
  date {
    match => [ "timestamp", "MMM dd HH:mm:ss" ]
    target => "time"
  }
  mutate {
    convert => { "conn" => "integer" }
  }
}
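One caveat: with target => "time" the parsed date goes into a separate time field, while Kibana's Discover time column normally uses @timestamp. If the goal is to have the log's own timestamp drive that column, a sketch that relies on the date filter's default target (@timestamp) might be closer:
date {
  # with no explicit target, the parsed value overwrites @timestamp,
  # which Kibana's time column uses by default
  match => [ "timestamp", "MMM dd HH:mm:ss" ]
}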

How to leverage Logstash to index data without generating extra fields

I am testing Elasticsearch to handle around 1 billion small docs (only 8 fields each). When I use Logstash to index the data, it adds other fields like "message", "@version", and "@timestamp" that are not useful in my case and seem to consume a lot of document size. Is there a way to index only the fields defined in the configuration?
Yes, simply add the following mutate filter in your Logstash configuration:
filter {
  mutate {
    remove_field => [ "@version", "@timestamp", "message" ]
  }
}
Yes, you can add and remove fields. To remove fields, use the following snippet in your conf file.
filter {
  mutate {
    remove_field => [ "@timestamp", "message", "@version" ]
  }
}
To add a new field, use the following snippet:
filter {
  mutate {
    add_field => { "foo_%{somefield}" => "Hello world, from %{host}" }
  }
}
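If the goal is to keep only the handful of fields you define and drop everything else regardless of name, the prune filter with a whitelist might also be worth a look; a minimal sketch, where field1 and field2 are placeholders for your actual field names:
filter {
  prune {
    # keep only fields whose names match these patterns (placeholders here)
    whitelist_names => [ "^field1$", "^field2$" ]
  }
}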
