How to create field using Logstash and grok plugin - elasticsearch

I have a Tomcat log in the format below:
10.0.6.35 - - [21/Oct/2019:00:00:04 +0000] "GET /rest/V1/productlist/category/4259/ar/final_price/asc/4/20 HTTP/1.1" 200 14970 12
I want to create fields from the last two columns, which are bytes and duration, and analyze them in Kibana. I am using Filebeat and Logstash to ship the data to Elasticsearch.
My Logstash configuration file is below. I tried this configuration, but I am not able to see the fields in Kibana.
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => ["message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes}(?m) %{NUMBER:duration}" ]
    #match => { "duration" => "%{NUMBER:duration}" }
    # match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # mutate {
  #   remove_field => ["@version", "@timestamp"]
  # }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  if [fields][log_type] == "access-log" {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "%{[fields][service]}-%{+YYYY.MM.dd}"
    }
  }
  if [fields][log_type] == "application-log" {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "%{[fields][service]}-%{+YYYY.MM.dd}"
    }
  }
  else {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "logstashhh-%{+YYYY.MM.dd}"
    }
  }
}
I want duration and bytes to become fields I can use in Kibana for visualization.

Try this as your logstash configuration:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    # capture the last two space-separated numbers on the line as bytes and duration
    match => { "message" => "%{NUMBER:bytes} %{NUMBER:duration}$" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  if [fields][log_type] == "access-log" {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "%{[fields][service]}-%{+YYYY.MM.dd}"
    }
  }
  else if [fields][log_type] == "application-log" {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "%{[fields][service]}-%{+YYYY.MM.dd}"
    }
  }
  else {
    elasticsearch {
      hosts => ["172.31.30.73:9200"]
      index => "logstashhh-%{+YYYY.MM.dd}"
    }
  }
}
Note that grok's match option takes a hash ({ "field" => "pattern" }), and the second output condition should be an else if so that access-log events are not also written to the fallback index.
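If you also want the client IP, timestamp, method, request, and status as fields, the whole access-log line can be parsed with a stock Apache-style pattern plus a trailing duration. This is a sketch, not tested against your exact log; the pattern is named HTTPD_COMMONLOG in recent grok pattern sets (COMMONAPACHELOG in older ones):

```
filter {
  grok {
    # HTTPD_COMMONLOG already yields clientip, timestamp, verb, request,
    # response and bytes; the extra NUMBER captures the trailing duration column.
    # The :int suffix asks grok to store the value as an integer so Kibana
    # can aggregate on it.
    match => { "message" => "%{HTTPD_COMMONLOG} %{NUMBER:duration:int}" }
  }
}
```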

Related

logstash doesn't report all the events

I can see that some events are missing when reporting logs to Elasticsearch. For example, if I send 5 log events, only 3 or 4 are reported.
I am using Logstash 7.4 to read my log messages and store the information in Elasticsearch 7.4. Below is my Logstash configuration:
input {
file {
type => "web"
path => ["/Users/a0053/Downloads/logs/**/*-web.log"]
start_position => "beginning"
sincedb_path => "/tmp/sincedb_file"
codec => multiline {
pattern => "^(%{MONTHDAY}-%{MONTHNUM}-%{YEAR} %{TIME}) "
negate => true
what => previous
}
}
}
filter {
if [type] == "web" {
grok {
match => [ "message","(?<frontendDateTime>%{MONTHDAY}-%{MONTHNUM}-%{YEAR} %{TIME})%{SPACE}(\[%{DATA:thread}\])?( )?%{LOGLEVEL:level}%{SPACE}%{USERNAME:zhost}%{SPACE}%{JAVAFILE:javaClass} %{USERNAME:orgId} (?<loginId>[\w.+=:-]+#[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:[.](?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*) %{GREEDYDATA:jsonstring}"]
}
json {
source => "jsonstring"
target => "parsedJson"
remove_field=>["jsonstring"]
}
mutate {
add_field => {
"actionType" => "%{[parsedJson][actionType]}"
"errorMessage" => "%{[parsedJson][errorMessage]}"
"actionName" => "%{[parsedJson][actionName]}"
"Payload" => "%{[parsedJson][Payload]}"
"pageInfo" => "%{[parsedJson][pageInfo]}"
"browserInfo" => "%{[parsedJson][browserInfo]}"
"dateTime" => "%{[parsedJson][dateTime]}"
}
}
}
}
output{
if "_grokparsefailure" in [tags]
{
elasticsearch
{
hosts => "localhost:9200"
index => "grokparsefailure-%{+YYYY.MM.dd}"
}
}
else {
elasticsearch
{
hosts => "localhost:9200"
index => "zindex"
}
}
stdout{codec => rubydebug}
}
As new logs keep being written to the log files, I can see a difference in the log counts.
Any suggestions would be appreciated.
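One possible cause, an assumption rather than something confirmed by the question: the multiline codec buffers the final event of a file until a following line arrives to terminate it, so the most recent event per file can appear missing. The codec's auto_flush_interval option emits a pending event after a period of inactivity:

```
codec => multiline {
  pattern => "^(%{MONTHDAY}-%{MONTHNUM}-%{YEAR} %{TIME}) "
  negate => true
  what => previous
  # flush a buffered multiline event after 5 seconds of inactivity,
  # so the last event in a file is emitted without waiting for the next line
  auto_flush_interval => 5
}
```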

Error in grok filter when starting logstash

I have the following Logstash conf file:
input {
tcp {
port => 12345
codec => json
}
}
filter {
grok {
break_on_match => true
match => [
"message", "%{TIMESTAMP_ISO8601:timestamp} (verbose|info|debug) (hostd|vpxa)",
]
mutate {
add_tag => "esxi_verbose"
}
}
}
if "esxi_verbose" in [tags] {
drop{}
}
output {
stdout { codec => rubydebug }
elasticsearch {
hosts => ["localhost:9200"]
index => "logstash-%{+YYYY.MM.dd}"
}
}
I am trying to drop any verbose, debug, or info messages. When I start Logstash I get the error:
[2019-03-03T16:53:11,731][ERROR][logstash.agent] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, \", ', -, [, { at line 13, column 5 (byte 211) after filter {\n grok {\n break_on_match => true\n match => [\n \"message\", \"%{TIMESTAMP_ISO8601:timestamp} (verbose|info|debug) (hostd|vpxa)\",\n "
Can someone help me with what I am doing wrong?
You have three issues in the config:
1. There is a redundant comma at the end of the grok message line.
2. The mutate is inside the grok filter, but it should come after it.
3. The 'if' statement should be inside the 'filter' section.
This is the updated and working config:
input {
  tcp {
    port => 12345
    codec => json
  }
}
filter {
  grok {
    break_on_match => true
    match => [
      "message", "%{TIMESTAMP_ISO8601:timestamp} (verbose|info|debug) (hostd|vpxa)"
    ]
  }
  # tag only events the grok actually matched, so other messages are kept
  if "_grokparsefailure" not in [tags] {
    mutate {
      add_tag => "esxi_verbose"
    }
  }
  if "esxi_verbose" in [tags] {
    drop {}
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}

Logstash changes original @timestamp value received from filebeat

I noticed that the @timestamp field, which is correctly set by Filebeat, is changed automatically by Logstash, its value replaced with a timestamp taken from the log line itself (the field name is a_timeStamp).
Here is part of logstash debug log:
[2017-07-18T11:55:03,598][DEBUG][logstash.pipeline ] filter received {"event"=>{"@timestamp"=>2017-07-18T09:54:53.507Z, "offset"=>498, "@version"=>"1", "input_type"=>"log", "beat"=>{"hostname"=>"centos-ea", "name"=>"filebeat_shipper_kp", "version"=>"5.5.0"}, "host"=>"centos-ea", "source"=>"/home/elastic/ELASTIC_NEW/log_bw/test.log", "message"=>"2017-06-05 19:02:46.779 INFO [bwEngThread:In-Memory Process Worker-4] psg.logger - a_applicationName=\"PieceProxy\", a_processName=\"piece.PieceProxy\", a_jobId=\"bw0a10ao\", a_processInstanceId=\"bw0a10ao\", a_level=\"Info\", a_phase=\"ProcessStart\", a_activityName=\"SetAndLog\", a_timeStamp=\"2017-06-05T19:02:46.779\", a_sessionId=\"\", a_sender=\"PCS\", a_cruid=\"37d7e225-bbe5-425b-8abc-f4b44a5a1560\", a_MachineCode=\"CFDM7757\", a_correlationId=\"fa10f\", a_trackingId=\"9d3b8\", a_message=\"START=piece.PieceProxy\"", "type"=>"log", "tags"=>["beats_input_codec_plain_applied"]}}
[2017-07-18T11:55:03,629][DEBUG][logstash.pipeline ] output received {"event"=>{"a_message"=>"START=piece.PieceProxy", "log"=>"INFO ", "bwthread"=>"[bwEngThread:In-Memory Process Worker-4]", "logger"=>"psg.logger ", "a_correlationId"=>"fa10f", "source"=>"/home/elastic/ELASTIC_NEW/log_bw/test.log", "a_trackingId"=>"9d3b8", "type"=>"log", "a_sessionId"=>"\"\"", "a_sender"=>"PCS", "@version"=>"1", "beat"=>{"hostname"=>"centos-ea", "name"=>"filebeat_shipper_kp", "version"=>"5.5.0"}, "host"=>"centos-ea", "a_level"=>"Info", "a_processName"=>"piece.PieceProxy", "a_cruid"=>"37d7e225-bbe5-425b-8abc-f4b44a5a1560", "a_activityName"=>"SetAndLog", "offset"=>498, "a_MachineCode"=>"CFDM7757", "input_type"=>"log", "message"=>"2017-06-05 19:02:46.779 INFO [bwEngThread:In-Memory Process Worker-4] psg.logger - a_applicationName=\"PieceProxy\", a_processName=\"piece.PieceProxy\", a_jobId=\"bw0a10ao\", a_processInstanceId=\"bw0a10ao\", a_level=\"Info\", a_phase=\"ProcessStart\", a_activityName=\"SetAndLog\", a_timeStamp=\"2017-06-05T19:02:46.779\", a_sessionId=\"\", a_sender=\"PCS\", a_cruid=\"37d7e225-bbe5-425b-8abc-f4b44a5a1560\", a_MachineCode=\"CFDM7757\", a_correlationId=\"fa10f\", a_trackingId=\"9d3b8\", a_message=\"START=piece.PieceProxy\"", "a_phase"=>"ProcessStart", "tags"=>["beats_input_codec_plain_applied", "_dateparsefailure", "kv_ok", "taskStarted"], "a_processInstanceId"=>"bw0a10ao", "@timestamp"=>2017-06-05T17:02:46.779Z, "my_index"=>"bw_logs", "a_timeStamp"=>"2017-06-05T19:02:46.779", "a_jobId"=>"bw0a10ao", "a_applicationName"=>"PieceProxy", "TMS"=>"2017-06-05 19:02:46.779"}}
NB:
I noticed that this doesn't happen with a simple pipeline (without the grok, kv, and other plugins I use in my custom pipeline).
I changed Filebeat's json.overwrite_keys property to true, but with no success.
Can you explain why and how the @timestamp value gets changed? I didn't expect it to happen automatically (I have seen many posts of people asking how to do that on purpose), because @timestamp is a system field. What's wrong with that?
Here is my pipeline:
input {
beats {
port => "5043"
type => json
}
}
filter {
#date {
# match => [ "@timestamp", "ISO8601" ]
# target => "@timestamp"
#}
if "log_bw" in [source] {
grok {
patterns_dir => ["/home/elastic/ELASTIC_NEW/logstash-5.5.0/config/patterns/extrapatterns"]
match => { "message" => "%{CUSTOM_TMS:TMS}\s*%{CUSTOM_LOGLEVEL:log}\s*%{CUSTOM_THREAD:bwthread}\s*%{CUSTOM_LOGGER:logger}-%{CUSTOM_TEXT:text}" }
tag_on_failure => ["no_match"]
}
if "no_match" not in [tags] {
if "Payload for Request is" in [text] {
mutate {
add_field => { "my_index" => "json_request" }
}
grok {
patterns_dir => ["/home/elastic/ELASTIC_NEW/logstash-5.5.0/config/patterns/extrapatterns"]
match => { "text" => "%{CUSTOM_JSON:json_message}" }
}
json {
source => "json_message"
tag_on_failure => ["errore_parser_json"]
target => "json_request"
}
mutate {
remove_field => [ "json_message", "text" ]
}
}
else {
mutate {
add_field => { "my_index" => "bw_logs" }
}
kv {
source => "text"
trim_key => "\s"
field_split => ","
add_tag => [ "kv_ok" ]
}
if "kv_ok" not in [tags] {
drop { }
}
else {
mutate {
remove_field => [ "text" ]
}
if "ProcessStart" in [a_phase] {
mutate {
add_tag => [ "taskStarted" ]
}
}
if "ProcessEnd" in [a_phase] {
mutate {
add_tag => [ "taskTerminated" ]
}
}
date {
match => [ "a_timeStamp", "yyyy'-'MM'-'dd'T'HH:mm:ss.SSS" ]
}
elapsed {
start_tag => "taskStarted"
end_tag => "taskTerminated"
unique_id_field => "a_cruid"
}
}
}
}
}
else {
mutate {
add_field => { "my_index" => "other_products" }
}
}
}
output {
elasticsearch {
index => "%{my_index}"
hosts => ["localhost:9200"]
}
stdout { codec => rubydebug }
file {
path => "/tmp/loggata.tx"
codec => json
}
}
Thank you very much,
Andrea
This was the cause (a typo left over from previous tests): the date filter below parses a_timeStamp and, since the date filter's default target is @timestamp, it overwrote the original value:
date {
match => [ "a_timeStamp", "yyyy'-'MM'-'dd'T'HH:mm:ss.SSS" ]
}
Thank you guys, anyway!
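For context: the date filter writes to @timestamp by default. To keep the Beats-provided @timestamp and store the parsed value elsewhere, set an explicit target (the field name a_parsedTimestamp below is illustrative, not from the original pipeline):

```
date {
  match  => [ "a_timeStamp", "yyyy-MM-dd'T'HH:mm:ss.SSS" ]
  # default target is @timestamp; redirect the parsed value so the
  # original @timestamp from Filebeat stays intact
  target => "a_parsedTimestamp"
}
```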

:reason=>"Something is wrong with your configuration." GeoIP.dat Mutate Logstash

I have the following configuration for logstash.
There are 3 parts to this one is a generallog which we use for all applications they land in here.
second part is the application stats where in which we have a specific logger which will be configured to push the application statistics
third we have is the click stats when ever an event occurs on client side we may want to push it to the logstash on the upd address.
all 3 are udp based, we also use log4net to to send the logs to the logstash.
the base install did not have a GeoIP.dat file so got the file downloaded from the https://dev.maxmind.com/geoip/legacy/geolite/
have put the file in the /opt/logstash/GeoIPDataFile with a 777 permissions on the file and folder.
second thing is i have a country name and i need a way to show how many users form each country are viewing the application in last 24 hours.
so for that reason we also capture the country name as its in their profile in the application.
now i need a way to get the geo co-ordinates to use the tilemap in kibana.
What am i doing wrong.
if i take the geoIP { source -=> "country" section the logstash works fine.
when i check the
/opt/logstash/bin/logstash -t -f /etc/logstash/conf.d/logstash.conf
The configuration file is ok is what i receive. where am i going worng?
Any help would be great.
input {
udp {
port => 5001
type => generallog
}
udp {
port => 5003
type => applicationstats
}
udp {
port => 5002
type => clickstats
}
}
filter {
if [type] == "generallog" {
grok {
remove_field => message
match => { message => "(?m)%{TIMESTAMP_ISO8601:sourcetimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} +- %{IPORHOST:requesthost} - %{WORD:applicationname} - %{WORD:envname} - %{GREEDYDATA:logmessage}" }
}
if !("_grokparsefailure" in [tags]) {
mutate {
replace => [ "message" , "%{logmessage}" ]
replace => [ "host" , "%{requesthost}" ]
add_tag => "generalLog"
}
}
}
if [type] == "applicationstats" {
grok {
remove_field => message
match => { message => "(?m)%{TIMESTAMP_ISO8601:sourceTimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} - %{WORD:envName}\|%{IPORHOST:actualHostMachine}\|%{WORD:applicationName}\|%{NUMBER:empId}\|%{WORD:regionCode}\|%{DATA:country}\|%{DATA:applicationName}\|%{NUMBER:staffapplicationId}\|%{WORD:applicationEvent}" }
}
geoip {
source => "country"
target => "geoip"
database => "/opt/logstash/GeoIPDataFile/GeoIP.dat"
add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
mutate {
convert => [ "[geoip][coordinates]", "float"]
}
if !("_grokparsefailure" in [tags]) {
mutate {
add_tag => "applicationstats"
add_tag => [ "eventFor_%{applicationName}" ]
}
}
}
if [type] == "clickstats" {
grok {
remove_field => message
match => { message => "(?m)%{TIMESTAMP_ISO8601:sourceTimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} - %{IPORHOST:remoteIP}\|%{IPORHOST:fqdnHost}\|%{IPORHOST:actualHostMachine}\|%{WORD:applicationName}\|%{WORD:envName}\|(%{NUMBER:clickId})?\|(%{DATA:clickName})?\|%{DATA:clickEvent}\|%{WORD:domainName}\\%{WORD:userName}" }
}
if !("_grokparsefailure" in [tags]) {
mutate {
add_tag => "clicksStats"
add_tag => [ "eventFor_%{clickName}" ]
}
}
}
}
output {
if [type] == "applicationstats" {
elasticsearch {
hosts => "localhost:9200"
index => "applicationstats-%{+YYYY-MM-dd}"
template => "/opt/logstash/templates/udp-applicationstats.json"
template_name => "applicationstats"
template_overwrite => true
}
}
else if [type] == "clickstats" {
elasticsearch {
hosts => "localhost:9200"
index => "clickstats-%{+YYYY-MM-dd}"
template => "/opt/logstash/templates/udp-clickstats.json"
template_name => "clickstats"
template_overwrite => true
}
}
else if [type] == "generallog" {
elasticsearch {
hosts => "localhost:9200"
index => "generallog-%{+YYYY-MM-dd}"
template => "/opt/logstash/templates/udp-generallog.json"
template_name => "generallog"
template_overwrite => true
}
}
else{
elasticsearch {
hosts => "localhost:9200"
index => "logstash-%{+YYYY-MM-dd}"
}
}
}
As per the error message, the mutate you're trying to do could be wrong. Could you please change your mutate as below:
mutate {
  convert => { "geoip" => "float" }
  convert => { "coordinates" => "float" }
}
I guess you've given the mutate as an array, while it is a hash type by origin. Try converting both values individually. Your database path for geoip seems fine in your filter. Is that the whole error you've mentioned in the question? If not, update the question with the whole error if possible.
Refer here for in-depth explanations.
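Worth noting: the geoip filter expects its source field to contain an IP address, so a country name (as in source => "country" above) will not resolve to coordinates. A minimal sketch of a typical geoip block, assuming the client address captured by the grok above (requesthost) is the field to look up:

```
geoip {
  # source must hold an IP address, not a country name; geoip looks
  # the address up in the MaxMind database to produce location fields
  source => "requesthost"
  target => "geoip"
  database => "/opt/logstash/GeoIPDataFile/GeoIP.dat"
}
```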

logstash geoip not working for IPv4

I am indexing syslogs into Elasticsearch using Logstash [version 2.2]. I am also using geoip to get the source and destination addresses, but for certain logs geoip doesn't seem to work.
Config file:
input {
  tcp {
    type => syslog
    port => 8001
  }
  udp {
    type => syslog
    port => 8001
  }
}
filter {
if [type] == "syslog" {
grok {
match => {
"message" => "\<%{NUMBER:number}\>%{timestamp:timestamp} %{WORD:logType}: %{NUMBER:ruleNumber},%{NUMBER:subRuleNumber}%{DATA}%{NUMBER:tracker},%{WORD:realinterface},%{WORD:reasonForTheLogEntry},%{WORD:actionTakenThatResultedInTheLogEntry},%{WORD:directionOfTheTraffic},%{NUMBER:IPversion},%{DATA:class},%{DATA:flowLabel},%{NUMBER:hopLimit},%{WORD:protocol},%{NUMBER:protocolID},%{NUMBER:length},%{IPV6:srcIP},%{IPV6:destIP},%{NUMBER:srcPort},%{NUMBER:destPort},%{NUMBER:dataLength}"
}
add_field => { "event" => "name" }
}
}
geoip {
source => "srcIP"
target => "geoSrc"
}
geoip {
source => "destIP"
target => "geoDest"
}
geoip {
source => "icmpDetinationIP"
target => "icmpDest"
}
}
output {
csv {
fields => "message"
path => "/data/streamed-logs/%{[host]}-%{+YYYY-MM-dd}.log"
}
stdout {
codec => "rubydebug"
}
elasticsearch {
hosts => "address"
}
}
Address having a problem with geoIP:
I can't get the geoIP for addresses in this format: e80::c0d3:531b:f0cf:f546
You need to use the IPV6 grok pattern instead of the IPV4 one:
grok {
  match => {
    "message" => "...%{IPV6:srcIP},%{IPV6:destIP},%{IPV6:icmpDetinationIP}..."
                     ^              ^              ^
                     |              |              |
                     here           here           and here
  }
}
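If the same stream can carry both IPv4 and IPv6 addresses, the stock %{IP} pattern matches either family, since it is defined as an alternation of %{IPV6} and %{IPV4}. A sketch (the elided parts of the pattern are kept as in the answer above):

```
grok {
  match => {
    # %{IP} expands to (?:%{IPV6}|%{IPV4}), so it accepts either address family
    "message" => "...%{IP:srcIP},%{IP:destIP},%{IP:icmpDetinationIP}..."
  }
}
```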
