How to move unique IDs from Kafka to Elasticsearch?

How to move unique IDs from Kafka to Elasticsearch?
I have done this using the Elasticsearch sink connector from Kafka to Elasticsearch, but it sends the entire record. I need to send only the unique IDs from Kafka to ES.
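For illustration (not from the thread): if "only the unique IDs" means keeping just the ID field of each record, the sink connector config can drop every other field with the built-in ReplaceField transform. A minimal sketch; the topic name and the id field name are hypothetical, and on older Connect versions the "include" option is called "whitelist":

    {
      "name": "es-sink-ids-only",
      "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "source-topic",
        "connection.url": "http://localhost:9200",
        "key.ignore": "false",
        "schema.ignore": "true",
        "transforms": "keepId",
        "transforms.keepId.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
        "transforms.keepId.include": "id"
      }
    }

Note this only trims each document down to the id field; it does not de-duplicate repeated IDs across records.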

Related

For the Kafka sink connector, can I send a single message to multiple index documents in Elasticsearch?

I am receiving a very complex JSON payload inside a topic message, so I want to do some computations on it using SMTs and send the results to documents in different Elasticsearch indices. Is it possible?
I am not able to find a solution for this.
The Elasticsearch sink connector only writes to one index per record, based on the topic name. The Confluent documentation explicitly states that topic-altering transforms such as RegexRouter will not work as expected.
I'd suggest looking at the Logstash Kafka input and Elasticsearch output as an alternative; however, I'm still not sure how you'd "split" a record into multiple documents there either.
You may need an intermediate Kafka consumer such as Kafka Streams or ksqlDB to extract your nested JSON and emit the multiple records that you expect in Elasticsearch, for example by exploding a nested array as sketched below.
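For illustration (not from the answer above), a minimal ksqlDB sketch of that idea: explode a nested array so each element becomes its own record, then point the sink connector at the resulting topic. The stream, topic, and column names are hypothetical:

    -- Source stream with a nested array column (schema is an assumption for the example).
    CREATE STREAM orders_raw (order_id VARCHAR, items ARRAY<STRUCT<sku VARCHAR, qty INT>>)
      WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON');

    -- EXPLODE emits one row per array element.
    CREATE STREAM order_items_exploded AS
      SELECT order_id, EXPLODE(items) AS item
      FROM orders_raw EMIT CHANGES;

    -- Flatten the struct; the sink connector then indexes each row as its own document.
    CREATE STREAM order_items WITH (KAFKA_TOPIC='order_items', VALUE_FORMAT='JSON') AS
      SELECT order_id, item->sku AS sku, item->qty AS qty
      FROM order_items_exploded EMIT CHANGES;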

When I search in Elasticsearch, does Elasticsearch send a REST API request to the original DB?

When I search in Elasticsearch, does Elasticsearch send a REST API request to the original DB?
Or does Elasticsearch hold the original data?
I found that Elasticsearch has the indexed data, but I am not certain whether it also has the original data.
Elasticsearch is a database itself, so if you want some external data source (e.g. an SQL database) to be searchable in Elasticsearch, you need to index the data into Elasticsearch first and then search against that data.
So no, the REST API will not query the original DB, but rather the data you have in Elasticsearch.
You can read more about the process here:
https://www.elastic.co/guide/en/cloud/current/ec-getting-started-search-use-cases-db-logstash.html
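For illustration (not part of the original answer), the linked guide drives this kind of sync with Logstash's JDBC input and Elasticsearch output. A minimal sketch of such a pipeline; the connection details, table, and field names are hypothetical:

    input {
      jdbc {
        jdbc_driver_library    => "/path/to/postgresql.jar"      # JDBC driver for the source DB
        jdbc_driver_class      => "org.postgresql.Driver"
        jdbc_connection_string => "jdbc:postgresql://localhost:5432/appdb"
        jdbc_user              => "app"
        jdbc_password          => "secret"
        schedule               => "*/5 * * * *"                  # re-poll the table every 5 minutes
        statement              => "SELECT id, title, body FROM articles"
      }
    }
    output {
      elasticsearch {
        hosts       => ["http://localhost:9200"]
        index       => "articles"
        document_id => "%{id}"   # reuse the DB primary key so re-runs update instead of duplicating
      }
    }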

BigQuery to Elasticsearch (avoid adding duplicate documents to Elasticsearch)

I am trying to sync data between BigQuery and Elasticsearch using the job template provided in GCP. The issue is that BigQuery sends all the documents every time the job is run, and since Elasticsearch assigns the document _id itself, this creates duplicate documents.
Is there a way to configure which field is used as the _id while sending data from BigQuery to Elasticsearch?
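The question has no answer in the thread. For illustration only, one possible mitigation is to make the _id deterministic on the Elasticsearch side, so re-running the full export overwrites documents instead of duplicating them. A sketch under the assumptions that each row carries a stable unique column (the hypothetical row_id below) and that an _id set in an ingest pipeline is honored for the template's bulk requests, which should be verified against your cluster and version:

    PUT _ingest/pipeline/bq-dedup
    {
      "description": "Copy the row's stable key into the document _id",
      "processors": [
        { "set": { "field": "_id", "value": "{{row_id}}" } }
      ]
    }

    PUT bq-index/_settings
    {
      "index.default_pipeline": "bq-dedup"
    }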

How to put two KSQLDB tables in the same index in Elasticsearch

I have two tables in ksqlDB that I want to put in the same index in Elasticsearch,
but the Elasticsearch Service Sink Connector for Confluent Platform does not support
topic-altering transforms such as:
io.confluent.connect.transforms.ExtractTopic$Key
io.confluent.connect.transforms.ExtractTopic$Value
as noted in the documentation: https://docs.confluent.io/kafka-connect-elasticsearch/current/overview.html#limitations
Are there other ways of doing it?
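The thread offers no answer. For illustration only: since the sink derives the index name from the topic, one workaround is to merge the two tables into a single topic inside ksqlDB and sink only that topic, so no topic-altering SMT is needed. A minimal sketch, assuming the two tables' underlying topics can be re-read as streams with compatible schemas; all names are hypothetical:

    -- Re-read each table's underlying topic as a stream (schemas must be compatible).
    CREATE STREAM table_a_s (id VARCHAR KEY, total DOUBLE)
      WITH (KAFKA_TOPIC='TABLE_A_TOPIC', VALUE_FORMAT='JSON');
    CREATE STREAM table_b_s (id VARCHAR KEY, total DOUBLE)
      WITH (KAFKA_TOPIC='TABLE_B_TOPIC', VALUE_FORMAT='JSON');

    -- One combined stream backed by one topic; the Elasticsearch index name then
    -- follows from this single topic.
    CREATE STREAM combined WITH (KAFKA_TOPIC='combined_for_es', VALUE_FORMAT='JSON') AS
      SELECT * FROM table_a_s EMIT CHANGES;

    INSERT INTO combined SELECT * FROM table_b_s EMIT CHANGES;

Note that documents from the two sources will overwrite each other if they share keys, so the record key needs to stay unique across both tables.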

Confluent Elasticsearch Sink connector, write.method: "UPSERT" on a different key

In the Confluent Elasticsearch sink connector, I am trying to write to the same Elasticsearch index from two different topics. The first topic uses INSERT and the other uses UPSERT. For the UPSERT, I want to update the JSON document based on some other field instead of "_id". Is that possible? If yes, how can I do that?
Use key.ignore=false and use the existing primary key columns as the _id for each JSON document.
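For illustration (expanding on the answer above, not taken from it), a sketch of the UPSERT side of such a connector: ValueToKey and ExtractField promote the business key field into the record key, which the sink then uses as the document _id. The topic and field names are hypothetical:

    {
      "name": "es-sink-upsert",
      "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "orders-upserts",
        "connection.url": "http://localhost:9200",
        "write.method": "upsert",
        "key.ignore": "false",
        "schema.ignore": "true",
        "transforms": "toKey,extractKey",
        "transforms.toKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
        "transforms.toKey.fields": "order_id",
        "transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
        "transforms.extractKey.field": "order_id"
      }
    }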
