Logstash: create new event in filter

When filtering events in Logstash (20+ attributes), I would like to create a new event that carries just one parameter from the original event and store it in a different Elasticsearch index.
I know this is possible with the clone filter plugin, but I don't want to manually remove every attribute except the one I need.
I could also just clone the event (the new event will be stored in a separate Elasticsearch index), but that would duplicate the unneeded attributes.
Is there a filter plugin for this purpose, or some hidden feature? Or does the clone filter plugin handle removing all attributes from cloned messages?

ElastAlert is a simple framework for alerting on anomalies, spikes, or other patterns of interest from data in Elasticsearch.
http://elastalert.readthedocs.io/en/latest/elastalert.html
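For the original question, here is a rough sketch of the clone-plus-prune approach: the clone filter only duplicates the event, so a prune (or a mutate with remove_field) on the clone is still needed to strip everything except the attribute you want. The field and index names (needed_field, summary-index, main-index) are placeholders:

filter {
  clone {
    # each clone is labelled with its clone name; on older Logstash versions this
    # lands in the "type" field (check the clone filter docs for your version)
    clones => ["summary"]
  }
  if [type] == "summary" {
    # keep only the attribute we need (plus @timestamp and type) on the clone
    prune {
      whitelist_names => ["^needed_field$", "^@timestamp$", "^type$"]
    }
  }
}

output {
  if [type] == "summary" {
    elasticsearch { index => "summary-index" }   # the clone with a single attribute
  } else {
    elasticsearch { index => "main-index" }      # the original, untouched event
  }
}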

Related

How to handle dynamic index creation in Elasticsearch using Apache NiFi?

I am routing data to Elasticsearch using NiFi, and I'm using NiFi to dynamically create indices based on a set of attributes. I'm using Index Lifecycle Management (ILM) in Elasticsearch, which requires all indices to be bootstrapped manually beforehand for the ILM settings to be applied. Since my NiFi flow automatically ingests messages into Elasticsearch, any index created automatically will not have the ILM policies applied.
Currently my flow is: ConsumeKafka --> UpdateAttribute --> PutElasticsearchRecord.
A solution (I think) would be to call an InvokeHTTP processor in front of the PutElasticsearchRecord processor to bootstrap the indices dynamically, via the extracted attributes, before ingesting into Elasticsearch. Indices are created dynamically using the syntax index_${attribute_1}_${attribute_2}. My only concern is that the InvokeHTTP processor would run for every new flowfile; that could mean thousands of calls to bootstrap an index, and if the index already exists there could be a collision.
Is this really the best way to do this? Perhaps I could run the QueryElasticsearchRecord processor to get a list of indices and somehow match that against incoming flowfiles on the attribute_1 and attribute_2 fields, but that would still require a continuous query, I think?
What you could do is have InvokeHTTP run if and only if it sees a specific value or attribute signalling that a new (previously unseen) index needs to be created in Elasticsearch. Just an idea if you want to head down that route.

Copy documents in another index on creation in Elasticsearch

We want to keep track of all the changes to a document, so we want to store all the document versions in a separate index.
Is there a way, when a document is added or changed, to also send the entire document to another index? Maybe there is a processor for this use case?
As far as I know, Elasticsearch itself only keeps a version number; there is no way to trace back to a previous version.
You could maintain the version history in a separate Elasticsearch index.
Whenever you update main_index, make sure you update the version index as well:
POST main_index/_doc/doc_id
POST main_index/_doc/doc_id_version
Maybe you can configure Logstash to do this... not sure.
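To flesh out that Logstash idea as an untested sketch: clone every event, write the clone to a versions index without a fixed document id (so every version is kept), and key the main index on a doc_id field so it always holds the latest version. The index names, host, and the doc_id field are assumptions:

filter {
  clone {
    clones => ["history"]   # the clone goes to the versions index
  }
}

output {
  if [type] == "history" {
    elasticsearch {
      hosts => ["http://localhost:9200"]
      index => "main_index_versions"
      # no document_id: each write becomes a new document, so old versions are preserved
    }
  } else {
    elasticsearch {
      hosts => ["http://localhost:9200"]
      index => "main_index"
      document_id => "%{doc_id}"   # same id every time, so only the latest version is kept here
    }
  }
}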

Is Event-Driven Programming possible using Logstash?

I have an event that I want to take action on when it happens. This event is being processed by Logstash and pushed to Elasticsearch.
I see in the Logstash docs that there are filters and outputs. I thought a filter might be what I was after, but looking closer at the docs it does not seem to have the functionality to push data to an endpoint/API: "Filters are intermediary processing devices in the Logstash pipeline. You can combine filters with conditionals to perform an action on an event if it meets certain criteria."
Is it possible to take an action while the event is being processed by Logstash 7.4? Or do I need to wait and take some kind of action by polling Elasticsearch for the event?
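One common pattern (a sketch only; the condition, endpoint, and host below are hypothetical) is to keep the elasticsearch output and add a conditional http output, so Logstash calls your API whenever a matching event passes through the pipeline:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "events-%{+YYYY.MM.dd}"
  }
  # fire a webhook for events that match the condition
  if [status] == "FAILED" {
    http {
      url         => "https://example.com/hooks/alert"
      http_method => "post"
      format      => "json"
    }
  }
}

This keeps the action inside the Logstash pipeline itself, so no polling of Elasticsearch is needed for events you can recognize at ingest time.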

How do I exclude/predefine fields for Index Patterns in Kibana?

I am using ELK to monitor REST API servers. Logstash decomposes each URL into a JSON object with fields for the query parameters, header parameters, request duration, and headers.
TL;DR: I want all these fields retained so that when I look at a specific message I can see all the details, but I only need a few of them for queries and reports/visualizations in Kibana.
I've been testing for a few weeks and adding some new fields on the server side, so whenever I do, I need to refresh the index pattern. However, the auto-detection now finds 300+ fields, and I'm guessing it indexes all of them.
I would like to restrict it to index only a set of fields, since I assume that the more fields it detects, the larger the index gets.
It was about 300 MB/day for a week (100-200 fields); then, when I added a new field and had to refresh, it went to 350 fields and 1 GB/day. After I accidentally deleted the ELK instance yesterday, I redid everything, and now the indexes are around 100 MB/day so far, which is why I got curious.
I found these docs, but I'm not sure which ones are relevant or how they relate/need to be put together:
Mapping, index patterns, indices, templates/Filebeat/rollup policy
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
https://discuss.elastic.co/t/index-lifecycle-management-for-existing-indices/181749/3
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/indices-templates.html
(One of them has a PUT call that sends a huge JSON body, but I'm not sure how you would enter something like that in PuTTY. Postman/JMeter maybe, but these need to be executed on the server itself, which is just an SSH session with no GUI/text window.)
To remove fields from your log events (since you are using Logstash), you can use the remove_field option of the Logstash mutate filter.
Ref: Mutate filter plugin
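A minimal example of that filter, with hypothetical field names standing in for attributes you never query in Kibana:

filter {
  mutate {
    # drop attributes that are never used for queries or visualizations
    remove_field => ["request_headers", "raw_cookie", "internal_trace_id"]
  }
}

Note that remove_field drops the fields from the stored document entirely, not just from the Kibana index pattern.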

Updating document and adding new field in elastic search

We have a use case where the data is updated daily. Some attributes of existing documents change, and some new records arrive. Is it possible to reindex the data that is already there with the updated values and also add the new records?
If yes, please explain how.
Is it done with the Update API?
I am indexing like this:
String json = getJsonMapper().writeValueAsString(data);
// no explicit id is set here, so Elasticsearch generates a new id for every document
bulkRequestBuilder.add(getClient().prepareIndex(indexName, typeName).setSource(json));
I am not passing any id. How can I update this? What is the best way?
Elasticsearch uses Apache Lucene under the covers, and in Lucene documents are immutable.
You can use the Update API for your use case. This API does a delete and a save underneath, but that doesn't concern you. You can even update just part of the document, which means Elasticsearch will retrieve the old document, generate the new one, delete the old one, and save the new one.
The catch is that for all of this to work you need to use the same id. If you don't provide one and use the Index API, Elasticsearch will generate an id for you, which means the document will be saved as a new document.
The Update API needs the id; otherwise it doesn't know what to update.

Resources