Logstash issues in creating index remove .raw field in kibana - elasticsearch

I have written a logstash conf filefor reading logs. If I use the default index, that is logstash-*, I could see .raw field in kibana. However, if I create a new index in conf file in logstash like
output{
elasticsearch {
hosts => "localhost"
index => "batchjob-*"}
}
Then the new index cant configure .raw field. Is there any resolve ways to solve it? Great Thanks.

The raw fields are created by a specific index template that the Logstash elasticsearch output creates in Elasticsearch.
What you can do is simply copy that template to a file named batchjob.json and change the template name to batchjob-* (see below)
{
"template" : "batchjob-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" }
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" },
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#timestamp": { "type": "date" },
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "float" },
"longitude" : { "type" : "float" }
}
}
}
}
}
}
Then you can modify your elasticsearch output like this:
output {
elasticsearch {
hosts => "localhost"
index => "batchjob-*"
template_name => "batchjob"
template => "/path/to/batchjob.json"
}
}

Related

Ingesting data from Spark to Elasticsearch with index template

In our existing design we are using logstash to fetch data from Kafka (JSON) and put it in ElasticSearch.
We are also using index template mapping while inserting data from logstash to ES and this could be done by setting 'template' property of ES output plugin of logstash, e.g.,
output {
elasticsearch {
template => "elasticsearch-template.json", //template file path
hosts => "localhost:9200"
template_overwrite => true
manage_template => true
codec=>plain
}
}
elasticsearch-template.json looks like below,
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "3s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256, "doc_values":true}
}
}
}
} ],
"properties" : {
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"type" : "object",
"dynamic": true,
"properties" : {
"location" : { "type" : "geo_point" }
}
}
}
}
}
}
Now we are going to replace logstash with Apache Spark and I want to use similar kind of usage of index template in Spark while inserting data to ES.
I am using elasticsearch-spark_2.11 library for this implementation.
Thanks.

How to do ES Moving Avearge Prediction with Logstash?

I am using Elasticsearch 2.3.2, and Logstash 2.3.3. I have found from https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-movavg-aggregation.html which states that moving average can do predictions. I know it is possible to only make query in ES, but I am not sure how should I do that with logstash.
I have a logstash file which reads a csv log file storing CPU usage for every 15 seconds. Should I just include the following into the logstash output json file for the related index as an output mapping?
{
"the_movavg":{
"moving_avg":{
"buckets_path": "the_sum",
"window" : 30,
"model" : "holt_winters",
"settings" : {
"type" : "mult",
"alpha" : 0.5,
"beta" : 0.5,
"gamma" : 0.5,
"period" : 7,
"pad" : true
}
}
}
This is my json file for logstash
{
"template" : "linux_cpu-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" }
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" },
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#timestamp": { "type": "date" },
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "float" },
"longitude" : { "type" : "float" }
}
}
}
}
}
}
And is it possible to have it as a graph as to be shown in Kibana?

Is it possible to set custom mapping for index in logstash but not in elasticsearch?

There's input, filter and then output in Logstash main coding.
Is it possible to set custom mapping in
output
{ elasticsearch {
}
If it is possible, how do I set it?
With this example:
"mappings" : {
"_default_" : {
"properties" : {
"service" : { "type" : "integer" },
"rule" : { "type" : "integer" },
"ICMP Type" : { "type" : "integer" },
"ICMP Code" : { "type" : "integer" },
"ip_offset" : { "type" : "integer" },
"ip_id" : { "type" : "integer" },
"ip_len" : { "type" : "integer" },
"Confidence Level" : { "type" : "integer" },
"fragments_dropped" : { "type" : "integer" },
"Severity" : { "type" : "integer" },
"serial_num" : { "type" : "integer" },
"during_sec" : { "type" : "integer" },
"Attack info" : {"type": "string", "index" : "not_analyzed" },
"peer gateway" : {"type": "string", "index" : "not_analyzed" }
Logstash comes with a default template that is used when writing documents to elasticsearch.
If you'd like to change the default, you can update your config and pass it the location of a template file.
https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template
You can use template and template_overwrite fields like that :
elasticsearch {
template => "/tttttttttttt/elasticsearch-logstash-template.json"
index => "logstash-%{+YYYY.MM.dd}"
cluster=>"cluster"
template_overwrite => true
}

How to map geoip field in logstash with elasticsearch in order to display it in tile map of Kibana4

I'd like to display geoip fields in tile map of Kibana4.
Using the standard / automatic logstash geoip mapping to elasticsearch it all works fine.
However when creating a non-standard geoip field, I am not quite sure how to customize the elasticsearch-template.json in logstash in order to represent this field correctly in elasticsearch so that it can be chosen in Kibana4 for tile map creation.
Sure, customizing the standard template is not the best way - better create a custom template and point to it in elasticsearch output of logstash.conf. I just quickly wanted to check how the mapping has to be defined, so I modified the standard template.
My logstash.conf:
input {
tcp {
port => 514
type => syslog
}
udp {
port => 514
type => syslog
}
}
filter {
# Standard geoip field is automatically mapped by logstash to
# elastic search by using the elasticsearch-template.json file
geoip { source => "host" }
grok {
match => [
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} SPort:%{INT:sport} Dst:%{IPV4:dst} DPort:%{INT:dport} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}",
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} Dst:%{IPV4:dst} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}",
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} Dst:%{IPV4:dst} Type:%{POSINT:type} Code:%{INT:code} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}"
]
}
# Is not mapped automatically by logstash in that it can be
# chosen in Kibana4 for tile map creation
geoip {
source => "src"
target => "src_geoip"
}
}
output {
elasticsearch {
host => "localhost"
protocol => "http"
}
}
My ...logstash-1.4.2\lib\logstash\outputs\elasticsearch\elasticsearch-template.json:
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"type" : "object",
"dynamic": true,
"path": "full",
"properties" : {
"location" : { "type" : "geo_point" }
}
},
"src_geoip" : {
"type" : "object",
"dynamic": true,
"path": "full",
"properties" : {
"location" : { "type" : "geo_point" }
}
}
}
}
}
}
UPDATE: I havent figured out yet when this json file gets applied in elasticsearch. I followed the hints outlined in this question and copied the json file to a config/templates folder in elasticsearch directory. After deleting the indizes and restart of elasticsearch, the template was applied successfully.
Anyway, the field "src_geoip.location" still does not show up in the tile map creation form of Kibana4 (only the standard geoip.location field does).
Try overwrite template after editing template. Re-create indexes in Kibana after config change.
output {
elasticsearch {
template_overwrite => "true"
...
}
}
You also need to add objects for the src_geoip object in the index template on your elasticsearch instance. To set the default template for all indexes that match "logstash-netflow-*", execute the following on your elasticsearch instance:
curl -XPUT localhost:9200/_template/logstash-netflow -d '{
"template" : "logstash-netflow-*",
"mappings" : {
"_default_" : {
"_all" : {
"enabled" : false
},
"properties" : {
"#timestamp" : { "index" : "analyzed", "type" : "date" },
"#version" : { "index" : "analyzed", "type" : "integer" },
"src_geoip" : {
"dynamic" : true,
"type" : "object",
"properties" : {
"area_code" : { "type" : "long" },
"city_name" : { "type" : "string" },
"continent_code" : { "type" : "string" },
"country_code2" : { "type" : "string" },
"country_code3" : { "type" : "string" },
"country_name" : { "type" : "string" },
"dma_code" : { "type" : "long" },
"ip" : { "type" : "string" },
"latitude" : { "type" : "double" },
"location" : { "type" : "double" },
"longitude" : { "type" : "double" },
"postal_code" : { "type" : "string" },
"real_region_name" : { "type" : "string" },
"region_name" : { "type" : "string" },
"timezone" : { "type" : "string" }
}
},
"netflow" : { ....snipped......
}
}
}
}}'

How to use _timestamp in logstash elasticsearch

I am trying to figure out how to use the _timestamp with logstash.
I have tried to add to the mapping:
"_timestamp" : {
"enabled" : true,
"path" : "#timestamp"
},
But that does not have the expected effect. I did this in the elasticsearch-template.json file (I tried with and without the "store"=true):
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
.....
And I added the modified file to the output filter
output {
elasticsearch_http {
template => '/tmp/elasticsearch-template.json'
host => '127.0.0.1'
port=>9200
}
}
In order to make sure the database is clean I repeatedly do:
curl -XDELETE http://localhost:9200/logstash*
curl -XDELETE http://localhost:9200/_template/logstash
rm ~/.sincedb_*
and then I try to import my logfile. But for some reasons, the _timestamp is not set.
The mapping seems to be ok
{
"logstash-2014.03.24" : {
"_default_" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#version" : {
"type" : "string",
"index" : "not_analyzed",
"omit_norms" : true,
"index_options" : "docs"
},
"geoip" : {
"dynamic" : "true",
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
},
"logs" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
The documents in the database look like
{
"_id": "Cps2Lq1nTIuj_VysOwwcWw",
"_index": "logstash-2014.03.25",
"_score": 1.0,
"_source": {
"#timestamp": "2014-03-25T00:47:09.703Z",
"#version": "1",
"created": "2014-03-25 01:47:09,703",
"host": "macbookpro.fritz.box",
"message": "2014-03-25 01:47:09,703 - Starting new HTTP connection (1): localhost",
"path": "/Users/scharf/git/ckann/annotator-store/logs/requests.log",
"text": "Starting new HTTP connection (1): localhost"
},
"_type": "logs"
},
why is the _timestamp not set???
In short, it does work.
I tested your exact scenario and here's what I found:
When using _source enabled and specifying _timestamp from some path in the _source,
you will never see _timestamp as part of the document, but if however, you add the ?fields query string part, for example:
http://<localhost>:9200/es_test_logs/ESTest1/ilq4PU3tR9SeoLo794wZlg?fields=_timestamp
you will get the correct _timestamp value.
If, instead of using path, you pass _timestamp externally (in the _source document), you will see _timestamp under the _source property in the document as normal.
If you disable the _source field, you will not see ANY property at all in the document, even those you set as "store" : true. You will only see them when specifying ?fields, or when building a query that returns those fields.

Resources