I am trying to send data from Kafka to Ignite using the Ignite Sink Connector. I have done a few experiments:
When I run Kafka, Ignite, and the connector locally on the same machine, I am able to send the data. In this case I provide the Ignite XML configuration file in connector.properties, which includes the cache name and the discovery property.
When I run Ignite on a remote node and the connector on the Kafka server's node, it is unable to push the data even after I change the IP in the discovery property. In this case I start Ignite on the other node with the XML configuration (using that node's IP) via a terminal shell script.
When I run Kafka and Ignite on separate remote nodes but run the connector on the Ignite side, it is able to pull from Kafka and push into the cache.
I am very new to Ignite. Please help me out with these doubts.
I am using the same XML configuration file that ships with the Ignite distribution, example-cache.xml.
Why is it so?
Ideally, on which side should the worker and connector run: Kafka or Ignite? If I want to keep them on the Kafka server only, what changes do I need to make?
Have I misconfigured something in the XML configuration? If yes, what should I configure in my Ignite server XML file and in the XML file that I pass to the connector?
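For reference, the discovery section of the XML that I pass to the connector looks roughly like the one in the stock example-cache.xml, with the remote node's address substituted (the IP below is only a placeholder):

<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                <property name="addresses">
                    <list>
                        <!-- placeholder for the remote Ignite node's IP and discovery port range -->
                        <value>192.0.2.10:47500..47509</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>
</property>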
Related
I would like to know how I can send data from Elasticsearch to Kafka and then to InfluxDB.
I've already tried using Confluent Platform with a source connector for Elasticsearch and a sink connector for InfluxDB, but the problem is that I'm stuck on sending data from Elasticsearch to Kafka.
Moreover, once my computer is turned off I no longer have a backup of the connectors and I have to start from scratch.
Hence my questions:
How do I send data from Elasticsearch to Kafka? Using Confluent Platform?
Do I really have to use Confluent Platform if I want to use Kafka Connect?
Kafka Connect is Apache 2.0 licensed and is included with the Apache Kafka download.
Confluent (among other companies) writes plugins for it, such as sinks for Elasticsearch or InfluxDB.
It appears the Elasticsearch source on Confluent Hub is not built by Confluent, for example.
Related - Use Confluent Hub without Confluent Platform installation
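For example, a connector plugin only needs to be dropped onto a plain Apache Kafka Connect worker; the relevant worker settings look roughly like this (the plugin directory is an arbitrary example, and connect-distributed.properties needs its usual other settings as well):

# connect-distributed.properties, as shipped with the Apache Kafka download
bootstrap.servers=localhost:9092
# directory where connector plugins downloaded from Confluent Hub (or built yourself) are unpacked
plugin.path=/opt/connect-plugins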
once my computer is turned off I no longer have a backup of the connectors and I have to start from scratch
Kafka Connect distributed mode stores its config data in Kafka topics... Kafka defaults to storing topic data in /tmp... which is deleted when you shut down your computer.
Similarly, if you are using Docker for any of these systems without mounted volumes, Docker is also not persistent by default.
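As a sketch, the relevant defaults can be overridden like this (the paths are arbitrary examples; anything persistent outside /tmp works):

# config/server.properties - keep Kafka topic data, including Connect's
# config, offset, and status topics, on a path that survives a reboot
log.dirs=/var/lib/kafka-data

# config/zookeeper.properties - same idea for ZooKeeper's snapshot directory
dataDir=/var/lib/zookeeper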
Is it possible to use a single Kafka instance with the Elasticsearch Sink Connector to write to separate Elasticsearch clusters with the same index? (Documentation.) The source data may be a backend database or an application. An example use case is that one cluster may be used for real-time search and the other for analytics.
If this is possible, how do I configure the sink connector? If not, I can think of a couple of options:
Use 2 Kafka instances, each pointing to a different Elasticsearch cluster. Either write to both, or write to one and copy from it to the other.
Use a single Kafka instance and write a stream processor which will write to both clusters.
Are there any others?
Yes, you can do this. You can use a single Kafka cluster and a single Kafka Connect worker.
One connector can write to one Elasticsearch instance, so if you have multiple destination Elasticsearch clusters you need multiple connectors configured.
The usual way to run Kafka Connect is in "distributed" mode (even on a single instance), and then you submit one—or more—connector configurations via the REST API.
You don't need a Java client to use Kafka Connect - it's configuration only. The configuration, per connector, says where to get the data from (which Kafka topic(s)) and where to write it (which Elasticsearch instance).
To learn more about Kafka Connect see this talk, this short video, and this specific tutorial on Kafka Connect and Elasticsearch
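As an illustration, two connector configurations along the following lines could be submitted to the Connect worker's REST API (typically POST http://localhost:8083/connectors); the names, topic, and URLs are placeholders, and the exact settings depend on the connector version:

{
  "name": "elastic-sink-search",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "topics": "orders",
    "connection.url": "http://es-search:9200",
    "type.name": "_doc",
    "key.ignore": "true",
    "schema.ignore": "true"
  }
}

{
  "name": "elastic-sink-analytics",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "topics": "orders",
    "connection.url": "http://es-analytics:9200",
    "type.name": "_doc",
    "key.ignore": "true",
    "schema.ignore": "true"
  }
}

Both read the same topic and only connection.url differs, so the same index name is written to both clusters.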
I'm trying to connect a Spring Boot Kafka app to Kafka on Alibaba Cloud.
The Kafka cluster runs on the E-MapReduce service.
However, I can't connect from the Spring Boot app, maybe due to some security credential that I need to provide?
I've already tried to set the boot properties as follows:
spring.kafka.properties.security.protocol=SSL
This gets the error: Connection to node -1 (/xx.xx.xx.xx:9092) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue.
spring.kafka.properties.security.protocol=SASL_SSL
This throws: Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
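From what I've read, SASL_SSL also needs a JAAS entry supplied to the client, for example via properties like these (the mechanism, port, and credentials are guesses on my part, since I don't know what the E-MapReduce brokers expect):

spring.kafka.bootstrap-servers=xx.xx.xx.xx:9093
spring.kafka.properties.security.protocol=SASL_SSL
spring.kafka.properties.sasl.mechanism=PLAIN
spring.kafka.properties.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<user>" password="<password>";
spring.kafka.properties.ssl.truststore.location=/path/to/client.truststore.jks
spring.kafka.properties.ssl.truststore.password=<truststore-password>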
Does anybody have experience connecting to Kafka on Alibaba Cloud?
I believe Kafka Connect could solve your problem of connecting a Spring Boot Kafka app to Kafka on Alibaba Cloud:
Step 1: Create Kafka clusters
Create a source Kafka cluster and a target Kafka cluster in E-MapReduce.
Step 2: Create a topic for storing the data to be migrated
Create a topic named connect in the source Kafka cluster.
Step 3: Create a Kafka Connect connector
Use Secure Shell (SSH) to log on to the header node of the source Kafka cluster.
Optional: customize the Kafka Connect configuration.
Step 4: View the status of the Kafka Connect connector and task node
View the status of the Kafka Connect connector and task nodes and make sure that they are in a normal state (a REST status-check sketch follows after the documentation link below).
Follow the remaining steps as your job requires.
Detailed instructions can be found in the 'Use Kafka Connect to migrate data' guide: https://www.alibabacloud.com/help/doc-detail/127685.htm
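As a concrete illustration of step 4, the worker's REST API can be queried directly from the header node (the connector name is a placeholder, and 8083 is the default Kafka Connect port):

curl http://localhost:8083/connectors
curl http://localhost:8083/connectors/<connector-name>/status

The status response lists the connector state and the state of each task, which should be RUNNING.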
Hope this helps.
I want to use ZooKeeper in order to synchronize my distributed services via ZooKeeper ephemeral nodes.
The idea is the following: on startup, every node in the topology will create a ZooKeeper session and ephemeral nodes. On node restart or failure, these nodes will disappear.
I'm going to implement this using Spring Boot. Right now I'm unsure which project and Maven dependency to use in order to get ZooKeeper client autoconfiguration, create a ZooKeeper session on application startup, create ZooKeeper ephemeral nodes from that client, and use ZooKeeper transactions.
At the moment I'm looking at Spring Cloud Zookeeper, but I'm not sure it is the right fit for this purpose. Could you please point me to the right Spring Boot ZooKeeper project and show a small example of how to achieve what I have described above?
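To make this concrete, what I want on startup is roughly the following (sketched here directly against Apache Curator, which I understand Spring Cloud Zookeeper builds on; the connect string and paths are made-up placeholders):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

public class EphemeralRegistration {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181",                      // ZooKeeper connect string (placeholder)
                new ExponentialBackoffRetry(1000, 3)); // retry policy
        client.start();

        // Ephemeral node: removed by ZooKeeper automatically when this session ends,
        // which gives the restart/failure behaviour described above
        client.create()
              .creatingParentsIfNeeded()
              .withMode(CreateMode.EPHEMERAL)
              .forPath("/services/my-service/node-1", new byte[0]);

        Thread.sleep(Long.MAX_VALUE); // keep the session (and the ephemeral node) alive
    }
}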
I'm looking to consume from Kafka and save data into Hadoop and Elasticsearch.
I've currently seen two ways of doing this: using Filebeat to consume from Kafka and send it to ES, and using the Kafka Connect framework. There are Kafka-Connect-HDFS and Kafka-Connect-Elasticsearch modules.
I'm not sure which one to use to send streaming data, though I think that if at some point I want to take data from Kafka and place it into Cassandra I can use a Kafka Connect module for that, whereas no such feature exists for Filebeat.
Kafka Connect can handle streaming data and is a bit more flexible. If you are just going to Elasticsearch, Filebeat is a clean integration for log sources. However, if you are going from Kafka to a number of different sinks, Kafka Connect is probably what you want. I'd recommend checking out the connector hub to see some examples of open-source connectors currently at your disposal: http://www.confluent.io/product/connectors/
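For example, a minimal HDFS sink configuration would look roughly like this (setting names as documented for the Confluent HDFS connector; the topic, URL, and flush size are placeholders):

name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=logs
hdfs.url=hdfs://namenode:8020
flush.size=1000

The Elasticsearch sink follows the same pattern with its own connector.class and a connection.url pointing at the cluster.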