Show Kafka topic title as a field in Kibana, logstash add_field? - elasticsearch

I have Logstash with Elasticsearch & Kibana 7.6.2.
I connect Logstash to Kafka as follows:
input {
  kafka {
    bootstrap_servers => "******"
    topics_pattern => [".*"]
    decorate_events => true
    add_field => { "[topic_name]" => "%{[@metadata][kafka][topic]}" }
  }
}
filter {
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash"
    document_type => "logs"
  }
}
It works, but the topic_name field shows up as the literal string %{[@metadata][kafka][topic]} instead of the actual topic name.
How can I fix it?

The syntax of the sprintf format you are using ( %{[@metadata][kafka][topic]} ) to get the value of that field is correct.
Apparently there is no such field @metadata.kafka.topic in your document at the moment the sprintf is evaluated. Therefore the sprintf can't obtain the field value and, as a result, the newly created field contains the sprintf call as a literal string.
However, since you set decorate_events => true, the metadata fields should be available, as stated in the documentation (https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html):
Metadata is only added to the event if the decorate_events option is set to true (it defaults to false).
I suspect the add_field action set in the input plugin causes the issue. Since the decorate_events option is what adds the metadata fields in the first place, the add_field action should run after the input plugin, i.e. in a filter.
Your configuration would then look like this:
input {
  kafka {
    bootstrap_servers => "******"
    topics_pattern => [".*"]
    decorate_events => true
  }
}
filter {
  mutate {
    add_field => { "[topic_name]" => "%{[@metadata][kafka][topic]}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash"
    document_type => "logs"
  }
}
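If the field still comes out as the literal sprintf string, a quick way to check whether the Kafka metadata is actually present on the events is a temporary stdout output with the rubydebug codec and metadata enabled (a debugging sketch of my own, not part of the original answer):
output {
  # temporary debug output: prints events including the normally hidden [@metadata] fields
  stdout { codec => rubydebug { metadata => true } }
}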

How about
add_field => { "topic_name" => "%{[@metadata][kafka][topic]}"}
i.e. [topic_name] -> topic_name

Related

Removing grok matched field after using it

I use Filebeat to ship log files into my Logstash and then filter out unnecessary fields. Everything works fine and I output the events into Elasticsearch, but there is a field which I use for the Elasticsearch index name. I define this variable in my grok match, but I couldn't find a way to remove it once it has served its purpose. I'll share my Logstash config below.
input {
  beats {
    port => "5044"
  }
}
filter {
  grok {
    match => { "[log][file][path]" => ".*(\\|\/)(?<myIndex>.*)(\\|\/).*.*(\\|\/).*(\\|\/).*(\\|\/).*(\\|\/)" }
  }
  json {
    source => "message"
  }
  mutate {
    remove_field => ["agent"]
    remove_field => ["input"]
    remove_field => ["@metadata"]
    remove_field => ["log"]
    remove_field => ["tags"]
    remove_field => ["host"]
    remove_field => ["@version"]
    remove_field => ["message"]
    remove_field => ["event"]
    remove_field => ["ecs"]
  }
  date {
    match => ["t","yyyy-MM-dd HH:mm:ss.SSS"]
    remove_field => ["t"]
  }
  mutate {
    rename => ["l","log_level"]
    rename => ["mt","msg_template"]
    rename => ["p","log_props"]
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9222" ]
    index => "%{myIndex}"
  }
  stdout { codec => rubydebug { metadata => true } }
}
I just want to remove the "myIndex" field from my index. With this config, I still see the field in Elasticsearch, and if possible I want to remove it. I've tried removing it together with the other fields, but that gave an error. I guess that's because I removed it before Logstash could hand it to Elasticsearch.
Create the field under [@metadata]. Those fields are available for use within Logstash but are ignored by outputs unless they use a rubydebug codec.
Adjust your grok filter:
match => { "[log][file][path]" => ".*(\\|\/)(?<[@metadata][myIndex]>.*)(\\|\/).*.*(\\|\/).*(\\|\/).*(\\|\/).*(\\|\/)" }
Delete "@metadata" from the mutate's remove_field list and change the output configuration to use
index => "%{[@metadata][myIndex]}"
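Putting the pieces together, the relevant parts of the adjusted config would look roughly like this (a sketch based on the fragments above; everything not shown stays as in the original config):
filter {
  grok {
    # capture the index name into [@metadata] so it never reaches Elasticsearch
    match => { "[log][file][path]" => ".*(\\|\/)(?<[@metadata][myIndex]>.*)(\\|\/).*.*(\\|\/).*(\\|\/).*(\\|\/).*(\\|\/)" }
  }
  mutate {
    # note: "@metadata" is no longer listed here
    remove_field => ["agent", "input", "log", "tags", "host", "@version", "message", "event", "ecs"]
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9222" ]
    index => "%{[@metadata][myIndex]}"
  }
}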

Use non whitelisted fields in Logstash if statement

I have a Logstash configuration where I used prune to whitelist a few fields.
prune {
  whitelist_names => ["id","name"]
}
The problem is, I need to use an if condition in the output on a field other than the id field, e.g. "type". But since I have not whitelisted "type", the if condition does not work.
if ( [type] in ["abc","efg"] ) {
  elasticsearch {
    action => "update"
    hosts => [ "localhost:9200" ]
    index => "index"
    document_id => "%{id}"
    doc_as_upsert => true
  }
}
How can I use a non-whitelisted field in the if condition?
Before your prune filter, add a mutate filter to copy the value of the field you're going to delete (type) into a new metadata field. Then, prune. Then, use the new metadata field in your output condition.
...
filter {
  ...
  mutate {
    add_field => {
      "[@metadata][type]" => "%{type}"
    }
  }
  prune {
    whitelist_names => ["id","name"]
  }
  ...
}
output {
  if [@metadata][type] in ["abc","efg"] {
    elasticsearch {
      action => "update"
      hosts => [ "localhost:9200" ]
      index => "index"
      document_id => "%{id}"
      doc_as_upsert => true
    }
  }
}

Logstash (Extracting parts of fields using regex)

I am using the Kafka plugin to input data into logstash from kafka.
input {
  kafka {
    bootstrap_servers => ["{{ kafka_bootstrap_server }}"]
    codec => "json"
    group_id => "{{ kafka_consumer_group_id }}"
    auto_offset_reset => "earliest"
    topics_pattern => ".*"    # this line ensures it reads from all kafka topics
    decorate_events => true
    add_field => { "[@metadata][label]" => "kafka-read" }
  }
}
The kafka topics are of the format
ingest-abc &
ingest-xyz
I use the following filter to specify the ES index where it should end up by setting the [@metadata][index_prefix] field.
filter {
  mutate {
    add_field => {
      "[@metadata][index_prefix]" => "%{[@metadata][kafka][topic]}"
    }
    remove_field => ["[kafka][partition]", "[kafka][key]"]
  }
  if [message] {
    mutate {
      add_field => { "[pipeline_metadata][normalizer][original_raw_message]" => "%{message}" }
    }
  }
}
So my ES indexes end up being
ingest-abc-YYYY-MM-DD
ingest-xyz-YYYY-MM-DD
How do I set the index_prefix to
abc-YYYY-MM-DD & xyz-YYYY-MM-DD instead,
by getting rid of the common ingest- prefix?
The regex that matches it is: (?!ingest)\b(?!-)\S+
But I am not sure where it would fit in the config.
Thanks!
OK, so I figured it out, in case anyone ever stumbles upon a similar problem:
I basically used mutate's gsub option instead of extra filters and grok.
gsub replaces any text matching the pattern (second argument) with the text passed as the third argument.
filter {
  mutate {
    rename => { "[@metadata][kafka]" => "kafka" }
    gsub => [ "[@metadata][index_prefix]", "ingest-", "" ]
  }
}
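For completeness, the output can then build the index name from the stripped prefix; a minimal sketch, assuming the index is [@metadata][index_prefix] plus a daily date suffix (the suffix format is my assumption, not part of the original answer):
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # [@metadata][index_prefix] now holds "abc" or "xyz" after the gsub above
    index => "%{[@metadata][index_prefix]}-%{+YYYY-MM-dd}"
  }
}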

logstash 5.0.1: set up elasticsearch multiple indexes output for multiple kafka input topics

I have a logstash input setup as
input {
  kafka {
    bootstrap_servers => "zookeper_address"
    topics => ["topic1","topic2"]
  }
}
I need to feed the topics into two different indexes in Elasticsearch. Can anyone help me with how the output should be set up for such a task? At this time I am only able to set up
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my_index"
    codec => "json"
    document_id => "%{id}"
  }
}
I need two indexes on the same Elasticsearch instance, say index1 and index2, which will be fed by the messages coming in on topic1 and topic2 respectively.
First, you need to add decorate_events to your kafka input in order to know from which topic the message is coming
input {
  kafka {
    bootstrap_servers => "zookeper_address"
    topics => ["topic1","topic2"]
    decorate_events => true
  }
}
Then you have two options, both involving conditional logic. The first is to introduce a filter that adds the correct index name depending on the topic name. For this you need to add:
filter {
  if [kafka][topic] == "topic1" {
    mutate {
      add_field => { "[@metadata][index]" => "index1" }
    }
  } else {
    mutate {
      add_field => { "[@metadata][index]" => "index2" }
    }
  }
  # remove the field containing the decorations, unless you want them to land into ES
  mutate {
    remove_field => ["kafka"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][index]}"
    codec => "json"
    document_id => "%{id}"
  }
}
The second option is to do the if/else directly in the output section, like this (but the additional kafka field will land in ES):
output {
  if [kafka][topic] == "topic1" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "index1"
      codec => "json"
      document_id => "%{id}"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "index2"
      codec => "json"
      document_id => "%{id}"
    }
  }
}

:reason=>"Something is wrong with your configuration." GeoIP.dat Mutate Logstash

I have the following configuration for Logstash.
There are 3 parts to this: one is a general log which we use for all applications; they land in here.
The second part is the application stats, for which we have a specific logger that is configured to push the application statistics.
The third is the click stats: whenever an event occurs on the client side, we may want to push it to Logstash on the UDP address.
All 3 are UDP based, and we also use log4net to send the logs to Logstash.
The base install did not have a GeoIP.dat file, so I downloaded the file from https://dev.maxmind.com/geoip/legacy/geolite/
and put it in /opt/logstash/GeoIPDataFile with 777 permissions on the file and folder.
The second thing is that I have a country name and I need a way to show how many users from each country are viewing the application in the last 24 hours,
so for that reason we also capture the country name as it is in their profile in the application.
Now I need a way to get the geo coordinates to use the tilemap in Kibana.
What am I doing wrong?
If I take out the geoip { source => "country" } section, Logstash works fine.
When I check with
/opt/logstash/bin/logstash -t -f /etc/logstash/conf.d/logstash.conf
"The configuration file is ok" is what I receive. So where am I going wrong?
Any help would be great.
input {
  udp {
    port => 5001
    type => generallog
  }
  udp {
    port => 5003
    type => applicationstats
  }
  udp {
    port => 5002
    type => clickstats
  }
}
filter {
  if [type] == "generallog" {
    grok {
      remove_field => message
      match => { message => "(?m)%{TIMESTAMP_ISO8601:sourcetimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} +- %{IPORHOST:requesthost} - %{WORD:applicationname} - %{WORD:envname} - %{GREEDYDATA:logmessage}" }
    }
    if !("_grokparsefailure" in [tags]) {
      mutate {
        replace => [ "message" , "%{logmessage}" ]
        replace => [ "host" , "%{requesthost}" ]
        add_tag => "generalLog"
      }
    }
  }
  if [type] == "applicationstats" {
    grok {
      remove_field => message
      match => { message => "(?m)%{TIMESTAMP_ISO8601:sourceTimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} - %{WORD:envName}\|%{IPORHOST:actualHostMachine}\|%{WORD:applicationName}\|%{NUMBER:empId}\|%{WORD:regionCode}\|%{DATA:country}\|%{DATA:applicationName}\|%{NUMBER:staffapplicationId}\|%{WORD:applicationEvent}" }
    }
    geoip {
      source => "country"
      target => "geoip"
      database => "/opt/logstash/GeoIPDataFile/GeoIP.dat"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float"]
    }
    if !("_grokparsefailure" in [tags]) {
      mutate {
        add_tag => "applicationstats"
        add_tag => [ "eventFor_%{applicationName}" ]
      }
    }
  }
  if [type] == "clickstats" {
    grok {
      remove_field => message
      match => { message => "(?m)%{TIMESTAMP_ISO8601:sourceTimestamp} \[%{NUMBER:threadid}\] %{LOGLEVEL:loglevel} - %{IPORHOST:remoteIP}\|%{IPORHOST:fqdnHost}\|%{IPORHOST:actualHostMachine}\|%{WORD:applicationName}\|%{WORD:envName}\|(%{NUMBER:clickId})?\|(%{DATA:clickName})?\|%{DATA:clickEvent}\|%{WORD:domainName}\\%{WORD:userName}" }
    }
    if !("_grokparsefailure" in [tags]) {
      mutate {
        add_tag => "clicksStats"
        add_tag => [ "eventFor_%{clickName}" ]
      }
    }
  }
}
output {
  if [type] == "applicationstats" {
    elasticsearch {
      hosts => "localhost:9200"
      index => "applicationstats-%{+YYYY-MM-dd}"
      template => "/opt/logstash/templates/udp-applicationstats.json"
      template_name => "applicationstats"
      template_overwrite => true
    }
  }
  else if [type] == "clickstats" {
    elasticsearch {
      hosts => "localhost:9200"
      index => "clickstats-%{+YYYY-MM-dd}"
      template => "/opt/logstash/templates/udp-clickstats.json"
      template_name => "clickstats"
      template_overwrite => true
    }
  }
  else if [type] == "generallog" {
    elasticsearch {
      hosts => "localhost:9200"
      index => "generallog-%{+YYYY-MM-dd}"
      template => "/opt/logstash/templates/udp-generallog.json"
      template_name => "generallog"
      template_overwrite => true
    }
  }
  else {
    elasticsearch {
      hosts => "localhost:9200"
      index => "logstash-%{+YYYY-MM-dd}"
    }
  }
}
As per the error message, the mutate you're trying to do could be wrong. Could you please change your mutate as below:
mutate {
  convert => { "geoip" => "float" }
  convert => { "coordinates" => "float" }
}
I guess you've given the convert option as an array, whereas it is a hash type by origin. Try converting both values individually. Your database path for geoip seems fine in your filter. Is that the whole error you've mentioned in the question? If not, please update the question with the full error.
Refer here for in-depth explanations.
