How do I create multiple tasks in Kafka Source Connector? - apache-kafka-connect

I'm new to Apache Kafka and I'm trying to create multiple tasks, each with a separate purpose. But the source connector method taskClass() returns only one task class, so how do I create more?

Deploy multiple connector instances, using the plugin you require. You have gone too low-level by looking at taskClass() etc. If you can't get the data from multiple sources with a single connector configuration, just create additional connector configurations; each one is just a config file. One connector = one task (or more, if scaled out).
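For example (a sketch only; the connector class, file paths, and topic names below are placeholders), two independent source connectors submitted to the Connect REST API, each with its own configuration:

{
  "name": "file-source-a",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/source-a.txt",
    "topic": "topic-a"
  }
}

{
  "name": "file-source-b",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/source-b.txt",
    "topic": "topic-b"
  }
}

Each JSON document is POSTed separately to the Connect REST API (POST /connectors); each resulting connector then spawns its own task (or tasks, up to tasks.max).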

Related

Using multiple file suppliers in Spring Cloud Stream

I have an application with the file-supplier Spring Cloud module included. My workflow is: track file creation/modification in a particular directory and push events to a Kafka topic when such events occur. FileSupplierConfiguration is used to configure this file supplier. But now I have to track one more directory and push events to another Kafka topic. The issue is that there is no way to include multiple FileSupplierConfiguration instances in the project to configure another file supplier. I remember that one of the main principles of microservices, which spring-cloud-stream was designed around, is that you do one thing and do it well without affecting others; but this is still the same microservice with the same tracking/pushing functionality, just for another directory and topic. Is there any possibility to add one more file supplier with its own configuration using the file-supplier module? Or is the best solution to run one more application instance with different configuration?
Yes, the best way is to run another instance of this application with its own specific configuration properties. This is really how these functions have been designed: microservices based on convention over configuration. What you are asking really contradicts Spring Boot expectations. Imagine you needed to connect to several databases: only a single JdbcTemplate can be auto-configured, and the rest is only possible manually. It is better to keep the same code base, rely on the auto-configuration, and apply different properties per instance.
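For example, a minimal sketch of the per-instance properties (the property names assume the standard file-supplier and Kafka binder configuration; the directories and topics are made up):

# Instance 1: watch /data/in-a, publish to topic-a
file.supplier.directory=/data/in-a
spring.cloud.stream.bindings.fileSupplier-out-0.destination=topic-a

# Instance 2: watch /data/in-b, publish to topic-b
file.supplier.directory=/data/in-b
spring.cloud.stream.bindings.fileSupplier-out-0.destination=topic-b

Both instances run the same code base; only the externalized properties differ.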

Can empty string fields be ignored in a Kafka Connect sink?

I am using Kafka Connect to sink data into Elasticsearch. Usually we ignore empty fields when persisting to Elasticsearch. Can we do the same using Kafka Connect?
Sample input:
{"field1":"1","field2":""}
In the Elasticsearch index:
{"field1":"1"}
In Kafka Connect there is a concept called SMT (Single Message Transform). Several transforms are supported out of the box, but none of them does what you are looking for. You can, however, write your own SMT that performs this action (a sketch follows below):
● Create your JAR file.
● Install the JAR file. Copy your custom SMT JAR file (and any non-Kafka JAR files required by the transformation) into a directory that is under one of the directories listed in the plugin.path property in the Connect worker configuration.
For further instructions, refer to:
https://docs.confluent.io/platform/current/connect/transforms/custom.html
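For illustration, here is a minimal sketch of such a transform for schemaless (Map) values; the class name is made up, and records with a schema are passed through untouched:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Hypothetical SMT that drops empty-string fields from schemaless (Map) record values.
public class DropEmptyFields<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public R apply(R record) {
        if (!(record.value() instanceof Map)) {
            return record; // only schemaless Map values are handled in this sketch
        }
        @SuppressWarnings("unchecked")
        Map<String, Object> value = (Map<String, Object>) record.value();
        Map<String, Object> cleaned = new HashMap<>();
        for (Map.Entry<String, Object> entry : value.entrySet()) {
            if (!"".equals(entry.getValue())) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        // Copy the record, replacing the value with the cleaned map (no value schema).
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(), null, cleaned, record.timestamp());
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef(); // no configuration options in this sketch
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // nothing to configure
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}

You would then reference it in the sink connector configuration, e.g. transforms=dropEmpty and transforms.dropEmpty.type=com.example.DropEmptyFields (package and class name made up).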

How do I share the SpecificRecord POJOs auto-generated by avro-maven-plugin between applications?

I'm new to Kafka and Avro. I have been reading how Avro provides a nice Maven utility, avro-maven-plugin, to generate Avro POJO records. I also understand that using the schema registry is the way to have a centralized place to share schema evolution across multiple applications, mostly consumers.
All the examples I have been looking at have the consumer and producer in the same Java application, using the same generated SpecificRecord. That is fine for examples; however, I'm planning to have multiple consumers whose source code lives in different repositories.
How do you maintain and share the auto-generated SpecificRecord POJOs?
Is sharing SpecificRecord POJOs a thing, or should each application deal with deserialization using only the Avro definition from the schema registry?
Do you create a Java package dependency, or what would be a good practice?

spring-cloud-task: how to pass messages or flags between two apps

I have already built an ingestion job using Spring Batch which reads an XML file and ingests it into AEM, and it is working fine.
Now I am trying to convert this app to Spring Cloud Task. I want to split it into 4 different parts, each an individual app. I need to connect them in a Spring Cloud Data Flow workflow and pass some data and flags, based on which the next step in the flow will execute.
Is this possible with Spring Cloud Task? If yes, how can I bind them together? Please point me to a programming tutorial.
In the recent 1.2.0.RELEASE, we released a new feature called "composed tasks". With this, you can define a directed graph made of several spring-cloud-task (SCT) applications.
Each step in your flow can be an independent SCT application, which you can develop, test, and CI/CD in isolation. Once you're ready to orchestrate them as a composed graph, you'd then register and use them in the specially designed composed-task DSL or the drag & drop GUI.
Check out this screencast for more details.
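As a small sketch (the task and app names below are made up, and only one registration is shown), registering a step app and composing the four steps sequentially in the Data Flow shell could look like:

dataflow:> app register --name read-xml --type task --uri maven://com.example:read-xml:1.0.0
dataflow:> task create ingest-pipeline --definition "read-xml && transform && load-aem && notify"
dataflow:> task launch ingest-pipeline

Conditional transitions in the same DSL (e.g. read-xml 'FAILED' -> error-handler) let the next step depend on the exit status of the previous one.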

How to set JDBC datasource targets to all servers in the cluster through WLST?

I have a WebLogic domain where JDBC datasources are targeted to only one of the managed servers in the cluster. I need them targeted to all servers in the cluster. I am able to target them to all servers through the WebLogic console, but I need to do the same through WLST. I tried set() to target the different managed servers. But what if there is a new managed server in the future and I want the JDBC datasources to be targeted to the new one too?
It is very simple: you can use a Jython jarray to pass the target, giving the cluster name and 'Cluster' as the target type in the ObjectName.
clstrNam = raw_input('Cluster Name: ')
# Target the datasource at the cluster MBean instead of at individual servers
set('Targets', jarray.array([ObjectName('com.bea:Name=' + clstrNam + ',Type=Cluster')], ObjectName))
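For completeness, a hedged sketch of the surrounding edit session (the datasource name myDS, the admin URL, and the credentials below are placeholders):

import jarray
from javax.management import ObjectName

connect('weblogic', 'welcome1', 't3://adminhost:7001')  # placeholder credentials and admin URL
edit()
startEdit()

clstrNam = raw_input('Cluster Name: ')
cd('/JDBCSystemResources/myDS')  # navigate to the datasource in the edit tree
set('Targets', jarray.array([ObjectName('com.bea:Name=' + clstrNam + ',Type=Cluster')], ObjectName))

save()
activate()
disconnect()

Because the target is the cluster itself, any managed server you add to that cluster later will pick up the datasource automatically, which addresses the concern about future servers.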
Please let me know if you still have challenges.
Reference link: Generic datasource using WLST Example
HTH
