Running awk in logstash - elasticsearch

I do not have the ability to do much except receive unstructured syslogs from Kafka which have been produced with Logstash.
When I attach Logstash as a consumer, these syslogs are all over the place and contain half a dozen patterns or more which vary wildly. This feels like something better handled by streaming the messages through an awk filter, since the programmatic approach to parsing incoming messages is actually quite straightforward with such a tool.
Does anyone have any input on how one could attach a consumer to a Kafka topic, process the incoming logs, and ship them in an intelligent way to an Elasticsearch cluster?

Try using grok expressions in your Logstash config to parse the logs (https://logz.io/blog/logstash-grok/). This should allow you to filter, transform, or drop data.
Or use something like Cribl between Kafka and Elasticsearch (https://docs.cribl.io/stream/about/).
Note on the Cribl page that Kafka is one of the supported sources and Elasticsearch is one of the supported destinations. This should let you transform your data before ingesting it into Elasticsearch.
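To make the grok suggestion concrete, here is a minimal Logstash pipeline sketch. The broker address, topic name, cluster address, and grok patterns are placeholders I've assumed for illustration, not anything taken from the question:

    input {
      kafka {
        bootstrap_servers => "localhost:9092"   # placeholder broker address
        topics            => ["syslog"]         # placeholder topic name
      }
    }

    filter {
      grok {
        # Patterns are tried in order; the first one that matches wins.
        # These two are placeholders for the half-dozen formats you actually see.
        match => {
          "message" => [
            "%{SYSLOGLINE}",
            "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
          ]
        }
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]      # placeholder cluster address
        index => "syslog-%{+YYYY.MM.dd}"
      }
    }

A grok filter with a list of match patterns behaves much like a chain of awk rules: each pattern is tried in turn and unmatched events are simply tagged rather than dropped.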

Related

Can logstash perform statistical analysis on the data coming from filebeat?

OK, my question is whether it is possible to use Logstash to perform statistical analysis on the collected log data. I currently use Filebeat to collect nginx logs into the ES cluster and put the required labels on these logs. I plan to read these logs from the ES cluster and write a program to compute statistics on them, such as the traffic in a certain region over a period of time. Now I want to know whether the logs collected by Filebeat can be passed to Logstash for these statistics.
After a short period of research, I haven't found that Logstash has this function. I hope you can help me. Thanks.
I want to know whether Logstash can provide the functionality I need.
Logstash basically ingests, transforms, and ships your data regardless of format or complexity: derive structure from unstructured data with grok, decipher geo coordinates from IP addresses, anonymize or exclude sensitive fields, and ease overall processing. I don't know your specific use case, but you can use Kibana for analyzing statistical data. You don't even need Logstash if you have a node in your Elasticsearch cluster with the ingest node role.
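As an illustration of the grok/geoip part, here is a hedged filter sketch for nginx access logs, assuming the classic (non-ECS) field names. The statistics themselves (traffic per region over time) would then be done with Elasticsearch aggregations or Kibana visualizations rather than inside Logstash:

    filter {
      grok {
        # Parse a combined-format nginx access log line (placeholder pattern)
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
      geoip {
        # Derive region/city/coordinates from the client IP for per-region stats
        source => "clientip"
        target => "geoip"
      }
    }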

Can a Kafka sink connector send a single message to multiple index documents in Elasticsearch?

I am receiving a very complex JSON payload inside a topic message, so I want to do some computations on it using SMTs and send the results to different Elasticsearch index documents. Is this possible?
I am not able to find a solution for this.
The Elasticsearch sink connector only writes to one index per record, based on the topic name. It's explicitly written in the Confluent documentation that topic-altering transforms such as RegexRouter will not work as expected.
I'd suggest looking at the Logstash Kafka input and Elasticsearch output as an alternative; however, I'm still not sure how you'd "split" a record into multiple documents there either.
You may need an intermediate Kafka consumer such as Kafka Streams or ksqlDB to extract your nested JSON and emit multiple records that you expect in Elasticsearch.
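That said, if the nested JSON happens to carry an array, Logstash's split filter can emit one event per element, which covers the simple case. A hedged sketch, where the topic name, array field, and index routing field are all assumptions made for illustration:

    input {
      kafka {
        bootstrap_servers => "localhost:9092"    # assumption
        topics            => ["complex-topic"]   # assumption
        codec             => "json"
      }
    }

    filter {
      split {
        # Emit one event per element of the (assumed) "items" array
        field => "items"
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]        # assumption
        # Route each event to an index derived from an (assumed) field value
        index => "%{[items][type]}-%{+YYYY.MM.dd}"
      }
    }

Anything beyond splitting an array, such as reshaping arbitrary nested structures into several differently shaped documents, is where Kafka Streams or ksqlDB fits better.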

Elasticsearch + Logstash - Indexing data from different sources when receiving an event

Good day,
I have gotten into a bit of a headache while working on indexing some data in Elasticsearch and have some questions about a good approach.
As of now, an event is received on a Kafka topic with just a part of the data that should be stored in the document. The rest of the data needs to be collected after the event is received and is available from different APIs. To reduce the amount of work, it seems that Logstash could be a good approach.
Is there a way to configure Logstash to initiate data collection from different APIs and DBs when an event is received, and then assemble the document with the combined data, or am I stuck writing time-consuming custom logic for the indexing? I have searched around a bit but couldn't find a good answer to the problem.
What you need in Logstash is to look up/enrich your message with data from external APIs, right?
You could use Logstash's http filter plugin.
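A minimal sketch of such an enrichment step, with the URL and target field made up purely for illustration; in practice you would parameterize the request per event and handle failures, and for database lookups the jdbc_streaming filter plays a similar role:

    filter {
      http {
        # Call an external API for each event and merge the response
        # into the event under [enrichment]. URL is a made-up placeholder.
        url         => "http://enrich.example.com/api/lookup"
        verb        => "GET"
        target_body => "[enrichment]"
      }
    }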

Transfer data from elasticsearch to kafka

I want to transfer data of certain conditions from elasticsearch to kafka.
Is there any way to do that?
In the end, I used Logstash to transfer data from Elasticsearch to Kafka.
Logstash is also a common framework for ingesting, transforming, and stashing data, and it offers a variety of input and output plugins, including an elasticsearch input and a kafka output.
Besides, the elasticsearch input plugin provides a schedule mechanism, which is very convenient for ingesting incremental data.
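A minimal pipeline sketch for this approach; the cluster and broker addresses, index pattern, query, topic name, and schedule below are placeholders you would adapt to your own conditions:

    input {
      elasticsearch {
        hosts    => ["http://localhost:9200"]     # placeholder cluster address
        index    => "logs-*"                      # placeholder index pattern
        # Pull only documents matching your condition
        query    => '{ "query": { "range": { "@timestamp": { "gte": "now-5m" } } } }'
        # Cron-style schedule: re-run the query every 5 minutes for incremental pulls
        schedule => "*/5 * * * *"
      }
    }

    output {
      kafka {
        bootstrap_servers => "localhost:9092"     # placeholder broker address
        topic_id          => "from-elasticsearch" # placeholder topic name
        codec             => "json"
      }
    }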

Topic mapping when streaming from Kafka to Elasticsearch

When I transfer or stream two or three tables I can easily map them in Elasticsearch, but can I map topics to indices automatically?
I have streamed data from PostgreSQL to ES by mapping topics manually: topic.index.map=topic1:index1,topic2:index2, etc.
Can whatever topics the producer sends be mapped automatically, so that the ES connector consumes them without manual configuration?
By default, the topics map directly to an index of the same name.
If you want "better" control, you can use RegexRouter in a transforms property
To quote the docs
topic.index.map
This option is now deprecated. A future version may remove it completely. Please use single message transforms, such as RegexRouter, to map topic names to index names
If you cannot capture all of your topics with a single regex in one connector, then run more connectors, each with a different pattern.
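For example, a RegexRouter transform in the sink connector's configuration might look like the sketch below (properties form); the topic prefix and naming scheme are assumptions, so adjust the regex to your own topics:

    # Excerpt from an Elasticsearch sink connector config; "db-" prefix is assumed.
    transforms=routeToIndex
    transforms.routeToIndex.type=org.apache.kafka.connect.transforms.RegexRouter
    # e.g. strip a "db-" prefix so topic "db-orders" maps to index "orders"
    transforms.routeToIndex.regex=db-(.*)
    transforms.routeToIndex.replacement=$1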
