Logstash Elasticsearch compression

I have a working ELK stack and would like to enable index compression.
The official store compression documentation tells me that I need to do it at index creation.
I couldn't find anything related to store compression, or even index settings, in the Logstash elasticsearch output documentation.
Below is my logstash output configuration:
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
And the created index settings:
{
  "filebeat-2016.04.28": {
    "settings": {
      "index": {
        "creation_date": "1461915752875",
        "uuid": "co8bvXI7RFKFwB7oJqs8cA",
        "number_of_replicas": "1",
        "number_of_shards": "5",
        "version": {
          "created": "2030199"
        }
      }
    }
  }
}

You need to provide your own index template file in order to enable index compression.
So create a filebeat-template.json file like the one below; Logstash will use it whenever it creates a new filebeat index.
{
  "template" : "filebeat-*",
  "settings" : {
    "index.codec" : "best_compression"
  }
}
Then your elasticsearch output should be modified like this:
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    sniffing => true
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    template_name => "filebeat-template"
    template => "/path/to/filebeat-template.json"
  }
}
Then you can delete your existing filebeat-2016.04.28 index and relaunch Logstash. Logstash will create an index template called /_template/filebeat-template, which will kick in every time ES needs to create a new index whose name starts with filebeat-, and will apply the settings present in the template (among them the store compression one).
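If you want to double-check that the template was installed and that new indices pick up the codec, something like the following should work (assuming ES listens on localhost:9200; the second index name is only an example of a newly created daily index):
# Check that Logstash installed the template:
curl -XGET 'http://localhost:9200/_template/filebeat-template?pretty'
# After the next daily index has been created, confirm the codec setting took effect:
curl -XGET 'http://localhost:9200/filebeat-2016.04.29/_settings?pretty'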

Related

Logstash - Send output from log files to elk

I have an index in Elasticsearch that has a field named locationCoordinates. It's being sent to Elasticsearch from Logstash.
The data in this field looks like this...
-38.122, 145.025
When this field appears in Elasticsearch, it does not come up as a geo_point.
I know that if I apply the mapping below, it works.
{
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
But what I would like to know is how I can change my logstash.conf file so that it does this at startup.
At the moment my logstash.conf looks a bit like this...
input {
  # Default GELF input
  gelf {
    port => 12201
    type => gelf
  }
  # Default TCP input
  tcp {
    port => 5000
    type => syslog
  }
  # Default UDP input
  udp {
    port => 5001
    type => prod
    codec => json
  }
  file {
    path => [ "/tmp/app-logs/*.log" ]
    codec => json {
      charset => "UTF-8"
    }
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
And in Kibana the field ends up without the little geo sign next to it.
You simply need to modify your elasticsearch output to configure an index template in which you can add your additional mapping.
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    template_overwrite => true
    template => "/path/to/template.json"
  }
}
Then, in the file at /path/to/template.json, you can add your additional geo_point mapping:
{
  "template": "logstash-*",
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
If you want to keep the official logstash template, you can download it and add your specific geo_point mapping to it.
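For instance, you could retrieve the template that Logstash installs by default and use it as a starting point (this assumes the default template is already present in ES under the name logstash; note that the GET response wraps the template body in an outer "logstash" key, which needs to be stripped before reusing the file):
# Fetch the stock template installed by Logstash:
curl -XGET 'http://localhost:9200/_template/logstash?pretty' > template.json
# Edit template.json: remove the outer "logstash" wrapper, add the geo_point
# mapping shown above, then point the "template" setting of the elasticsearch
# output at this file.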

add custom mapping for elasticsearch in logstash

I am using Logstash to feed my logs into Elasticsearch. Every day, it creates a new index.
Here is the output part of my Logstash config file:
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["127.0.0.1"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
I want some fields to not be analysed. But every day, when a new index is created, a new mapping is created and all the fields are analysed. How can I force Elasticsearch to use a particular mapping every time a new index is created?
You can do this by assigning and managing index templates. For example, my configuration:
elasticsearch {
  hosts => ["localhost:9200"]
  index => "XXX-%{+YYYY.ww}"
  template => "/opt/logstash/templates/XXX.json"
  template_name => "XXX"
  manage_template => true
}
I believe my configuration may be slightly out of date, as we are sadly on an older version of Logstash ... so it would be helpful to read up on this in the docs: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
This is definitely possible inside Logstash, though.
You can use an ES index template, which will then be used when creating an index: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/indices-templates.html.
In your case the template would look like this:
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      ...
    }
  }
}
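As a rough sketch of what that could contain for the "not analysed fields" case, using the ES 2.x string/not_analyzed syntax (the template name and the client_ip field below are hypothetical; substitute your own):
# Sketch only: "my_logstash_template" and "client_ip" are hypothetical names.
curl -XPUT 'http://localhost:9200/_template/my_logstash_template' -d '
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "client_ip": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'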

Dynamic Index in ElasticSearch from Logstash

I have the following configuration in Logstash, whereby I am able to create a dynamic "document_type" in ES based on the input JSON received:
elasticsearch {
  hosts => ["localhost:9200"]
  index => "queuelogs"
  document_type => "%{action}"
}
Here, "action" is the parameter that I receive in JSON and different document_type gets created as per different action received.
Now I want this to be done same for Index creation, such as following:
elasticsearch {
  hosts => ["localhost:9200"]
  index => "%{logtype}"
  document_type => "%{action}"
}
Here, "logtype" is the parameter that I receive in JSON.
But somehow in ES, it creates index as "%{logtype}" only, not as per actual logtype value .
The input JSON is as following:
{
  "action": "UPLOAD",
  "user": "123",
  "timestamp": "2016 Jun 14 12:00:12",
  "data": {
    "file_id": "2345",
    "file_name": "xyz.pdf"
  },
  "header": {
    "proj_id": "P123",
    "logtype": "httplogs"
  },
  "comments": "Check comments"
}
Here, I tried to generate the index in the following ways:
index => "%{logtype}"
index => "%{header.logtype}"
But in both cases, Logstash does not replace the actual value of logtype from the JSON.
Since logtype is nested inside the header object, you need to reference it with Logstash's field reference syntax, like this:
elasticsearch {
  hosts => ["localhost:9200"]
  index => "%{[header][logtype]}"
  document_type => "%{action}"
}
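To confirm that indices are now being created from the field value, you can list the matching indices (httplogs being the logtype from the sample document above):
curl -XGET 'http://localhost:9200/_cat/indices/httplogs*?v'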

How to move data from one Elasticsearch index to another using the Bulk API

I am new to Elasticsearch. How to move data from one Elasticsearch index to another using the Bulk API?
I'd suggest using Logstash for this, i.e. you use one elasticsearch input plugin to retrieve the data from your index and another elasticsearch output plugin to push the data to your other index.
The Logstash config file would look like this:
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "source_index"            # the name of your source index
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => 9200
    protocol => "http"
    manage_template => false
    index => "target_index"            # the name of your target index
    document_type => "your_doc_type"   # make sure to set the appropriate type
    document_id => "%{id}"
    workers => 5
  }
}
After installing Logstash, you can run it like this:
bin/logstash -f logstash.conf
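Once the run finishes, a quick way to sanity-check the migration is to compare document counts between the two indices (names as in the config above):
curl -XGET 'http://localhost:9200/source_index/_count?pretty'
curl -XGET 'http://localhost:9200/target_index/_count?pretty'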

How to stop logstash from creating a default mapping in ElasticSearch

I am using Logstash to feed logs into Elasticsearch.
I am configuring Logstash as follows:
input {
  file {
    path => "/tmp/foo.log"
    codec => plain {
      format => "%{message}"
    }
  }
}
output {
  elasticsearch {
    #host => localhost
    codec => json {}
    manage_template => false
    index => "4glogs"
  }
}
I notice that as soon as I start Logstash, it creates a mapping (logs) in ES, as below.
{
  "4glogs": {
    "mappings": {
      "logs": {
        "properties": {
          "@timestamp": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "@version": {
            "type": "string"
          },
          "message": {
            "type": "string"
          }
        }
      }
    }
  }
}
How can I prevent Logstash from creating this mapping?
UPDATE:
I have now resolved this error too. "object mapping for [logs] tried to parse as object, but got EOF, has a concrete value been provided to it?"
As John Petrone has stated below, once you define a mapping, you have to ensure that your documents conform to the mapping. In my case, I had defined a mapping of "type: nested" but the output from logstash was a string.
So I removed all codecs (whether json or plain) from my Logstash config, and that allowed the JSON document to pass through without changes.
Here is my new Logstash config (with some additional filters for multiline logs).
input {
  kafka {
    zk_connect => "localhost:2181"
    group_id => "logstash_group"
    topic_id => "platform-logger"
    reset_beginning => false
    consumer_threads => 1
    queue_size => 2000
    consumer_id => "logstash-1"
    fetch_message_max_bytes => 1048576
  }
  file {
    path => "/tmp/foo.log"
  }
}
filter {
  multiline {
    pattern => "^\s"
    what => "previous"
  }
  multiline {
    pattern => "[0-9]+$"
    what => "previous"
  }
  multiline {
    pattern => "^$"
    what => "previous"
  }
  mutate {
    remove_field => ["kafka"]
    remove_field => ["@version"]
    remove_field => ["@timestamp"]
    remove_tag => ["multiline"]
  }
}
output {
  elasticsearch {
    manage_template => false
    index => "4glogs"
  }
}
You will need a mapping to store data in Elasticsearch and to search on it - that's how ES knows how to index and search those content types. You can either let logstash create it dynamically or you can prevent it from doing so and instead create it manually.
Keep in mind you cannot change existing mappings (although you can add to them). So first off you will need to delete the existing index. You would then modify your settings to prevent dynamic mapping creation. At the same time you will want to create your own mapping.
For example, this will create the mappings for the logstash data but also restrict any dynamic mapping creation via "strict":
$ curl -XPUT 'http://localhost:9200/4glogs/logs/_mapping' -d '
{
  "logs" : {
    "dynamic": "strict",
    "properties" : {
      "@timestamp": {
        "type": "date",
        "format": "dateOptionalTime"
      },
      "@version": {
        "type": "string"
      },
      "message": {
        "type": "string"
      }
    }
  }
}
'
Keep in mind that the index name "4glogs" and the type "logs" need to match what is coming from logstash.
For my production systems I generally prefer to turn off dynamic mapping as it avoids accidental mapping creation.
The following links should be useful if you want to make adjustments to your dynamic mappings:
https://www.elastic.co/guide/en/elasticsearch/guide/current/dynamic-mapping.html
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/dynamic-mapping.html
logs in this case is the index_type. If you don't want to create it as logs, specify some other index_type on your elasticsearch element. Every record in Elasticsearch is required to have an index and a type. Logstash defaults to logs if you haven't specified it.
There's always an implicit mapping created when you insert records into Elasticsearch, so you can't prevent it from being created. You can create the mapping yourself before you insert anything (via say a template mapping).
The setting manage_template of false just prevents it from creating the template mapping for the index you've specified. You can delete the existing template if it's already been created by using something like curl -XDELETE http://localhost:9200/_template/logstash?pretty
Index templates can help you. Please see this jira for more details. You can create index templates with wildcard support to match an index name and put your default mappings.
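As a sketch, such a template could combine a wildcard index pattern with the strict mapping shown earlier, so it is applied automatically whenever a matching index is created (the template name 4glogs_template is arbitrary):
# Sketch: "4glogs_template" is an arbitrary name; the mapping mirrors the one above.
curl -XPUT 'http://localhost:9200/_template/4glogs_template' -d '
{
  "template": "4glogs*",
  "mappings": {
    "logs": {
      "dynamic": "strict",
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version":   { "type": "string" },
        "message":    { "type": "string" }
      }
    }
  }
}'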
