How to use _timestamp in logstash elasticsearch

I am trying to figure out how to use the _timestamp with logstash.
I have tried to add to the mapping:
"_timestamp" : {
"enabled" : true,
"path" : "#timestamp"
},
But that does not have the expected effect. I did this in the elasticsearch-template.json file (I tried with and without "store" : true):
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
.....
And I referenced the modified file in the elasticsearch output:
output {
elasticsearch_http {
template => '/tmp/elasticsearch-template.json'
host => '127.0.0.1'
port => 9200
}
}
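To verify that the template actually reaches Elasticsearch, the template API can be queried directly (a quick sanity check; "logstash" is the default template_name used by the output):
curl -XGET http://localhost:9200/_template/logstash?pretty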
In order to make sure the database is clean I repeatedly do:
curl -XDELETE http://localhost:9200/logstash*
curl -XDELETE http://localhost:9200/_template/logstash
rm ~/.sincedb_*
and then I try to import my logfile. But for some reason, the _timestamp is not set.
The mapping seems to be OK:
{
"logstash-2014.03.24" : {
"_default_" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#version" : {
"type" : "string",
"index" : "not_analyzed",
"omit_norms" : true,
"index_options" : "docs"
},
"geoip" : {
"dynamic" : "true",
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
},
"logs" : {
"dynamic_templates" : [ {
"string_fields" : {
"mapping" : {
"index" : "analyzed",
"omit_norms" : true,
"type" : "string",
"fields" : {
"raw" : {
"index" : "not_analyzed",
"ignore_above" : 256,
"type" : "string"
}
}
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"_timestamp" : {
"enabled" : true,
"store" : true,
"path" : "#timestamp"
},
"properties" : {
"#timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
The documents in the database look like this:
{
"_id": "Cps2Lq1nTIuj_VysOwwcWw",
"_index": "logstash-2014.03.25",
"_score": 1.0,
"_source": {
"#timestamp": "2014-03-25T00:47:09.703Z",
"#version": "1",
"created": "2014-03-25 01:47:09,703",
"host": "macbookpro.fritz.box",
"message": "2014-03-25 01:47:09,703 - Starting new HTTP connection (1): localhost",
"path": "/Users/scharf/git/ckann/annotator-store/logs/requests.log",
"text": "Starting new HTTP connection (1): localhost"
},
"_type": "logs"
},
Why is the _timestamp not set?

In short, it does work.
I tested your exact scenario and here's what I found:
When _source is enabled and _timestamp is populated from a path in the _source, you will never see _timestamp as part of the document itself. However, if you add the ?fields query string part, for example:
http://localhost:9200/es_test_logs/ESTest1/ilq4PU3tR9SeoLo794wZlg?fields=_timestamp
you will get the correct _timestamp value.
If, instead of using path, you pass _timestamp externally (in the _source document), you will see _timestamp under the _source property in the document as normal.
If you disable the _source field, you will not see ANY property at all in the document, even those you set as "store" : true. You will only see them when specifying ?fields, or when building a query that returns those fields.
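For example, a search that asks for the stored _timestamp explicitly (a sketch against the index from the question; in ES 1.x the fields parameter can be combined with _source filtering):
curl -XGET http://localhost:9200/logstash-2014.03.25/logs/_search?pretty -d '{
"fields" : ["_timestamp"],
"_source" : true,
"query" : { "match_all" : {} }
}'
Each hit then carries its fields._timestamp value alongside the usual _source.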

Related

How to do ES Moving Average Prediction with Logstash?

I am using Elasticsearch 2.3.2 and Logstash 2.3.3. I found from https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-movavg-aggregation.html that the moving average aggregation can make predictions. I know it is possible to run the query directly in ES, but I am not sure how I should do that with Logstash.
I have a Logstash config which reads a CSV log file storing CPU usage every 15 seconds. Should I just include the following in the Logstash output JSON file for the related index, as an output mapping?
{
"the_movavg":{
"moving_avg":{
"buckets_path": "the_sum",
"window" : 30,
"model" : "holt_winters",
"settings" : {
"type" : "mult",
"alpha" : 0.5,
"beta" : 0.5,
"gamma" : 0.5,
"period" : 7,
"pad" : true
}
}
}
}
This is my JSON template file for Logstash:
{
"template" : "linux_cpu-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" }
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" },
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#timestamp": { "type": "date" },
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "float" },
"longitude" : { "type" : "float" }
}
}
}
}
}
}
And is it possible to display it as a graph in Kibana?
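For reference, moving_avg is a pipeline aggregation: it belongs in a search request, nested inside a date_histogram next to the metric named in buckets_path, not in the index template. A minimal sketch under that assumption, where cpu_usage is a placeholder name for whatever numeric field holds the CPU value:
# cpu_usage is a hypothetical field name; substitute the real CSV column
curl -XGET http://localhost:9200/linux_cpu-*/_search?pretty -d '{
"size" : 0,
"aggs" : {
"per_interval" : {
"date_histogram" : { "field" : "@timestamp", "interval" : "minute" },
"aggs" : {
"the_sum" : { "sum" : { "field" : "cpu_usage" } },
"the_movavg" : {
"moving_avg" : {
"buckets_path" : "the_sum",
"window" : 30,
"model" : "holt_winters",
"settings" : { "type" : "mult", "alpha" : 0.5, "beta" : 0.5, "gamma" : 0.5, "period" : 7, "pad" : true }
}
}
}
}
}
}'
As for Kibana: its standard visualizations at the time did not expose pipeline aggregations, so the prediction would have to be queried from Elasticsearch directly.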

Elasticsearch wildcard search string not_analyzed

I am having a problem with wildcard searches on a field that is a not_analyzed string.
I am using td-agent (plugin-elasticsearch) -> elasticsearch (kibana). I tried setting the mapping to not_analyzed because my field "tag" contains dots. Example of an entry:
{
"_index": "logstash-2016.06.27",
"_type": "fluentd",
"_id": "AVWRR1tIYMKfwXgMeyTA",
"_score": null,
"_source": {
"app": "RECEIVER",
"thread": "139639914489600",
"severity": "INFO ",
"message": "FM version 0",
"tag": "beeeon.ant-2.ada_server",
"#timestamp": "2016-06-27T11:53:35+02:00"
},
"fields": {
"#timestamp": [
1467021215000
]
},
"sort": [
1467021215000
]
}
This is the current mapping of the document:
{
"simple-template" : {
"order" : 0,
"template" : "logstash-*",
"settings" : {
"index" : {
"number_of_shards" : "1",
"number_of_replicas" : "1"
}
},
"mappings" : {
"_default_" : {
"properties" : {
"app" : {
"index" : "analyzed",
"type" : "string"
},
"severity" : {
"index" : "analyzed",
"type" : "string"
},
"#timestamp" : {
"index" : "not_analyzed",
"type" : "date"
},
"thread" : {
"index" : "analyzed",
"type" : "string"
},
"tag" : {
"index" : "not_analyzed",
"type" : "string"
},
"message" : {
"index" : "not_analyzed",
"type" : "string"
}
}
}
},
"aliases" : { }
}
}
Note the field "tag", having values such as "beeeon.ant-2.ada_server" or "beeeon.iotdata.ada_server".
When searching for the tag with a query such as 'tag:"beeeon.ant-2.ada_server"' or 'tag:"beeeon.iotdata.ada_server"', everything works correctly and I see the entries from that source. The problem arises when I try to perform a wildcard search such as '*' or 'tag:"beeeon.*.ada_server"'. I expect to see entries from both hosts, but I only see entries from the last one.
Thanks for any advice.
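One detail worth checking: in Lucene query_string syntax (which the Kibana search bar uses), a wildcard inside a quoted phrase is not expanded, so tag:"beeeon.*.ada_server" is searched as literal text. Against a not_analyzed field, an unquoted pattern or an explicit wildcard query should match both tags; a sketch:
curl -XGET http://localhost:9200/logstash-*/fluentd/_search?pretty -d '{
"query" : {
"wildcard" : { "tag" : "beeeon.*.ada_server" }
}
}'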

Logstash issues in creating index remove .raw field in kibana

I have written a Logstash conf file for reading logs. If I use the default index, that is logstash-*, I can see the .raw fields in Kibana. However, if I create a new index in the Logstash conf file like
output{
elasticsearch {
hosts => "localhost"
index => "batchjob-*"}
}
then the new index doesn't get the .raw fields. Is there any way to solve this? Thanks.
The raw fields are created by a specific index template that the Logstash elasticsearch output creates in Elasticsearch.
What you can do is simply copy that template to a file named batchjob.json and change the template name to batchjob-* (see below):
{
"template" : "batchjob-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" }
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" },
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#timestamp": { "type": "date" },
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "float" },
"longitude" : { "type" : "float" }
}
}
}
}
}
}
Then you can modify your elasticsearch output like this:
output {
elasticsearch {
hosts => "localhost"
index => "batchjob-*"
template_name => "batchjob"
template => "/path/to/batchjob.json"
}
}
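Once Logstash has started and pushed the template (or you have PUT it manually), you can verify it was installed; a quick check, assuming the template_name above:
curl -XGET http://localhost:9200/_template/batchjob?pretty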

Elasticsearch 1.7.3: doc_values treated as fielddata

I'm new to ElasticSearch and started working with ElasticSearch 1.7.3 as part of a Logstash-ElasticSearch-Kibana deployment.
I've defined a mapping template for my log messages, this is the interesting part:
{
"template" : "logstash-*",
"settings" : { "index.refresh_interval" : "5s" },
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"date_fields" : {
"match" : "*",
"match_mapping_type" : "date",
"mapping" : { "type" : "date", "doc_values" : true }
}
}],
"properties" : {
"#version" : { "type" : "string", "index" : "not_analyzed" },
"#timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
"message" : { "type" : "string" }
}
} ,
"my_log" : {
"_all" : { "enabled" : true, "omit_norms" : true },
"dynamic_templates" : [ {
"date_fields" : {
"match" : "*",
"match_mapping_type" : "date",
"mapping" : { "type" : "date", "doc_values" : true }
}
}],
"properties" : {
"#timestamp" : { "type" : "date", "format" : "dateOptionalTime" },
"file" : { "type" : "string" },
"message" : { "type" : "string" }
"geolocation" : { "type" : "string" },
}
}
}
}
Although the @timestamp field is defined with doc_values: true, I get a memory exception because it is treated as fielddata:
[FIELDDATA] Data too large, data for [@timestamp] would be larger than
limit of [633785548/604.4 mb]
NOTE:
I know I can change the memory limit or add more nodes to the cluster, but in my view this is a design problem: this field should not be held in memory.
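For what it's worth, one likely cause: dynamic templates only apply to dynamically mapped fields, and @timestamp is declared explicitly under properties without doc_values, so the explicit mapping wins and the field falls back to in-memory fielddata. A sketch of the fix (only the @timestamp lines change):
"properties" : {
"@timestamp" : { "type" : "date", "format" : "dateOptionalTime", "doc_values" : true },
...
}
Note that a changed template only takes effect on newly created indices.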

How to map geoip field in logstash with elasticsearch in order to display it in tile map of Kibana4

I'd like to display geoip fields in tile map of Kibana4.
Using the standard / automatic logstash geoip mapping to elasticsearch, it all works fine.
However, when creating a non-standard geoip field, I am not quite sure how to customize the elasticsearch-template.json in logstash so that this field is represented correctly in elasticsearch and can be chosen in Kibana4 for tile map creation.
Sure, customizing the standard template is not the best way; it would be better to create a custom template and point to it in the elasticsearch output of logstash.conf. I just quickly wanted to check how the mapping has to be defined, so I modified the standard template.
My logstash.conf:
input {
tcp {
port => 514
type => syslog
}
udp {
port => 514
type => syslog
}
}
filter {
# Standard geoip field is automatically mapped by logstash to
# elastic search by using the elasticsearch-template.json file
geoip { source => "host" }
grok {
match => [
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} SPort:%{INT:sport} Dst:%{IPV4:dst} DPort:%{INT:dport} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}",
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} Dst:%{IPV4:dst} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}",
"message", "<%{POSINT:syslog_pri}>%{YEAR} %{SYSLOGTIMESTAMP:syslog_timestamp} %{DATA:device} <%{POSINT:status}> %{WORD:activity} %{DATA:inout} \(%{DATA:msg}\) Src:%{IPV4:src} Dst:%{IPV4:dst} Type:%{POSINT:type} Code:%{INT:code} IPP:%{INT:ipp} Rule:%{INT:rule} Interface:%{WORD:iface}"
]
}
# This field is not mapped automatically by logstash, so it cannot
# be chosen in Kibana4 for tile map creation
geoip {
source => "src"
target => "src_geoip"
}
}
output {
elasticsearch {
host => "localhost"
protocol => "http"
}
}
My ...logstash-1.4.2\lib\logstash\outputs\elasticsearch\elasticsearch-template.json:
{
"template" : "logstash-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true},
"dynamic_templates" : [ {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"type" : "object",
"dynamic": true,
"path": "full",
"properties" : {
"location" : { "type" : "geo_point" }
}
},
"src_geoip" : {
"type" : "object",
"dynamic": true,
"path": "full",
"properties" : {
"location" : { "type" : "geo_point" }
}
}
}
}
}
}
UPDATE: I haven't figured out yet when this json file gets applied in elasticsearch. I followed the hints outlined in this question and copied the json file to a config/templates folder in the elasticsearch directory. After deleting the indices and restarting elasticsearch, the template was applied successfully.
Anyway, the field "src_geoip.location" still does not show up in the tile map creation form of Kibana4 (only the standard geoip.location field does).
Try overwriting the template after editing it, and re-create the indexes in Kibana after the config change:
output {
elasticsearch {
template_overwrite => "true"
...
}
}
You also need to add a mapping for the src_geoip object in the index template on your elasticsearch instance. To set the default template for all indexes that match "logstash-netflow-*", execute the following on your elasticsearch instance:
curl -XPUT localhost:9200/_template/logstash-netflow -d '{
"template" : "logstash-netflow-*",
"mappings" : {
"_default_" : {
"_all" : {
"enabled" : false
},
"properties" : {
"#timestamp" : { "index" : "analyzed", "type" : "date" },
"#version" : { "index" : "analyzed", "type" : "integer" },
"src_geoip" : {
"dynamic" : true,
"type" : "object",
"properties" : {
"area_code" : { "type" : "long" },
"city_name" : { "type" : "string" },
"continent_code" : { "type" : "string" },
"country_code2" : { "type" : "string" },
"country_code3" : { "type" : "string" },
"country_name" : { "type" : "string" },
"dma_code" : { "type" : "long" },
"ip" : { "type" : "string" },
"latitude" : { "type" : "double" },
"location" : { "type" : "double" },
"longitude" : { "type" : "double" },
"postal_code" : { "type" : "string" },
"real_region_name" : { "type" : "string" },
"region_name" : { "type" : "string" },
"timezone" : { "type" : "string" }
}
},
"netflow" : { ....snipped......
}
}
}
}}'
