Logstash Elasticsearch Output with specific fields to index - elasticsearch

Is there a possibility for Logstash to send the whole document to one index and only specific fields to another index in the Elasticsearch output?
Example: if I have 10 fields coming in from the input, can I write all of the fields to one index and only some specific fields to another index?
My current Logstash config looks like this:
input {
  kafka {
    # Kafka input settings
  }
}
filter {
  fingerprint {
    # fingerprint id generated from some fields
  }
}
output {
  elasticsearch {
    # the whole document should go into this index
  }
  elasticsearch {
    # only the fingerprint field should go into this index
  }
}
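The thread does not contain an answer, but one common pattern (a hedged sketch, not taken from the thread: the source fields, index names, and clone label below are assumptions) is to clone each event, prune the copy down to the fingerprint field, and route the original and the copy to different indices:
filter {
  fingerprint {
    source => ["field_a", "field_b"]   # assumed fields used to build the id
    target => "fingerprint"
    method => "SHA256"
  }
  # Duplicate the event; depending on the plugin version the clone label
  # ends up in [type] or in [tags].
  clone {
    clones => ["fingerprint_only"]
  }
  if [type] == "fingerprint_only" or "fingerprint_only" in [tags] {
    # Keep only the fingerprint (plus standard metadata) on the copy.
    prune {
      whitelist_names => ["^fingerprint$", "^@timestamp$", "^@version$"]
    }
  }
}
output {
  if [type] == "fingerprint_only" or "fingerprint_only" in [tags] {
    elasticsearch {
      index => "fingerprints-%{+YYYY.MM.dd}"   # assumed index for fingerprints only
    }
  } else {
    elasticsearch {
      index => "documents-%{+YYYY.MM.dd}"      # assumed index for the whole document
    }
  }
}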

Related

logstash restrict search result to past day

I want to query Elasticsearch in Logstash, using the Elasticsearch input plugin, for the index from the day before the current date.
I tried the following Logstash config:
input {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd-6}"
    query => '{ "query": { "query_string": { "query": "*" } } }'
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
output {
  stdout { codec => rubydebug }
}
Can someone help me on how to do it?
You can use date math index names within your Elasticsearch query:
Date math index name resolution enables you to search a range of
time-series indices, rather than searching all of your time-series
indices and filtering the results or maintaining aliases. Limiting the
number of indices that are searched reduces the load on the cluster
and improves execution performance. For example, if you are searching
for errors in your daily logs, you can use a date math name template
to restrict the search to the past two days.
Almost all APIs that have an index parameter, support date math in the
index parameter value.
For instance, to search the indices for yesterday, assuming the indices use the default Logstash index name format logstash-YYYY.MM.dd:
GET /<logstash-{now/d-1d}>/_search
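When the index name is sent literally in a URL (for example with curl), the date math characters must be percent-encoded; the following is the same search as above:
# Same search with the date math characters percent-encoded
GET /%3Clogstash-%7Bnow%2Fd-1d%7D%3E/_search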

elastic stack : i need set Time Filter field name with another field

I need to read messages (whose content is log data) from RabbitMQ with Logstash and then send them to Elasticsearch so I can build monitoring visualizations in Kibana. So I wrote the RabbitMQ input in Logstash like this:
input {
  rabbitmq {
    queue => "testLogstash"
    host => "localhost"
  }
}
and I wrote the output configuration to store the data in Elasticsearch like this:
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "d13-%{+YYYY.MM.dd}"
  }
}
Both of them are placed in myConf.conf.
The content of each message is JSON containing fields like this:
{
  "mDate": "MMMM dd YYYY, HH:mm:ss.SSS",
  "name": "test name"
}
But there are two problems. First, there is no date field offered when creating the new index pattern (Time Filter field name). Second, when I use this field in the same way as the default @timestamp, it is not available when building certain graph types. I think the reason is the data type of the field: it should be a date, but it is treated as a string.
I tried to convert the value of the field to a date with mutate in the Logstash config, like this:
filter {
  mutate {
    convert => { "mdate" => "date" }
  }
}
Now, two questions arise:
1- Is this the problem? If yes, what is the right solution to fix it?
2- My main need is to use the time when messages were put into the queue, not when Logstash picks them up. What is the best solution?
If you don't specify a value for @timestamp, you should get the current system time when Elasticsearch indexes the document. With that, you should be able to see items in Kibana.
If I understand you correctly, you'd rather use your mDate field for @timestamp. For this, use the date{} filter in Logstash.
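A minimal sketch of that date filter, assuming the mDate value really follows the format shown in the question (the field name and pattern are taken from the example above, not verified against real data):
filter {
  date {
    # Parse the mDate string and write the result into @timestamp.
    # Pattern assumed to match values like "January 05 2018, 13:45:21.123".
    match  => ["mDate", "MMMM dd YYYY, HH:mm:ss.SSS"]
    target => "@timestamp"
  }
}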

How to control number of shards in Elastic index from logstash?

I would like to control how many shards a new index should have in my logstash output file. Ex:
10-output.conf:
output {
  if [type] == "mytype" {
    elasticsearch {
      hosts => [ "1.1.1.1:9200" ]
      index => "logstash-mytype-%{+YYYY.ww}"
      workers => 8
      flush_size => 1000
      # ? <====== what option to control the number of index shards goes here?
    }
  }
}
From what I understand of the Logstash elasticsearch output options, this is not possible and a new index will default to 5 shards?
The Logstash-Elasticsearch combination is designed to work differently from what you expect: in Elasticsearch you define an index template, and the number of shards is one of its configuration settings.
Whenever Logstash creates a new index by sending documents to it, Elasticsearch uses that index template (by matching the new index name against the template's pattern) to actually create the index.
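A minimal sketch of such a template (the template name, pattern, and shard count are assumptions; on Elasticsearch 5.x the pattern key is "template" rather than "index_patterns", and current versions also offer the newer _index_template API):
# Legacy index template: any index whose name matches the pattern
# will be created with the settings below.
PUT _template/logstash-mytype
{
  "index_patterns": ["logstash-mytype-*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
Every new logstash-mytype-%{+YYYY.ww} index Logstash writes to will then be created with 2 shards.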

Searching or filtering on a multi value field

I have a MySQL database with a table that contains two important fields, title and age_range.
That table stores values like '45;60' for documents designed for users between 45 and 60 years old, '18;70' for users between 18 and 70 years old, and so on...
Now I would like to run the query 'test' on the field title with the filter '18;50' on the field age_range, and get back all documents matching 'test' whose age range overlaps this interval, including the two examples above.
For information, I use Logstash to index my data.
How can I achieve this?
Is there any processing to do while indexing my data with Logstash?
Is there any filter or tokenizer to use in the ES analyzer while indexing?
Thank you in advance
You can split the data into two fields with the grok filter. To ship the data to Elasticsearch, you can use the Logstash jdbc input and the elasticsearch output. You configure your pipeline like below:
input {
  jdbc {
    # JDBC input configuration
    # https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html
  }
}
filter {
  # Split the age_range field into two separate numeric fields
  grok {
    match => { "age_range" => "%{NUMBER:age_range_bottom:int};%{NUMBER:age_range_top:int}" }
  }
}
output {
  elasticsearch {
    # elasticsearch output configuration
  }
}
Analysis depends on how you want to search these fields. If you only need a range filter, numeric fields with their default mapping are enough. But you should do some work on title; for example, you can follow an autocomplete analyzer example if you want to handle autocomplete.
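As a hedged sketch of the search side (the index name is hypothetical, the field names come from the grok filter above, and it assumes the two fields are indexed as numbers and that the filter should match any document whose range overlaps 18-50):
GET /documents/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "test" } }
      ],
      "filter": [
        { "range": { "age_range_bottom": { "lte": 50 } } },
        { "range": { "age_range_top": { "gte": 18 } } }
      ]
    }
  }
}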

Elasticsearch converting a string to number

I am new to Elasticsearch and am just starting out with the ELK stack. I am collecting key-value type logs in Logstash and passing them to an index in Elasticsearch. I am using the kv filter plugin in Logstash, so all the fields are string type by default.
When I try to perform aggregation like avg or sum on a numeric field in Elasticsearch, I am getting an Exception: ClassCastException[org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData cannot be cast to org.elasticsearch.index.fielddata.IndexNumericFieldData]
When I check the mappings in the index, all the fields except the timestamp ones are marked as string.
Please tell me how to overcome this issue as I have many numeric fields in my log events for aggregation.
Thanks,
Keerthana
You could set explicit mappings for those fields (see e.g. Change default mapping of string to "not analyzed" in Elasticsearch for some guidance), but it's easier to just convert those fields to integers in Logstash using the mutate filter:
mutate {
  convert => ["name-of-field", "integer"]
}
Then Elasticsearch will do a better job at guessing the best data type for your field(s).
(See also Data type conversion using logstash grok.)
In the latest Logstash the syntax is as follows:
filter {
  mutate {
    convert => { "fieldname" => "integer" }
  }
}
You can visit this link for more detail: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-convert
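If you prefer the explicit-mapping route mentioned in the first answer, a minimal sketch is to declare the numeric fields up front when creating the index (the index and field names here are hypothetical, and on older Elasticsearch versions the mapping must be nested under a type name):
PUT /my-logs
{
  "mappings": {
    "properties": {
      "response_time": { "type": "integer" },
      "bytes_sent": { "type": "long" }
    }
  }
}
Note that changing the mapping of an existing index does not retype fields that are already indexed; old documents would need to be reindexed.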
