Embedded Debezium in Spring multiple instances - spring

Because I need to keep track of changes in my MySQL database, I'm beginning to insert Debezium into my Spring Boot project. On the basis of this link, I implemented Embedded Debezium. However, I am having difficulty with Debezium. Is it possible to use Embedded Debezium to scale my project across multiple Kubernetes pods? Or should I customize it to avoid receiving duplicate events on each Kubernetes pod?

Related

Jaeger with ElasticSearch

I have created a microservice based architecture using Spring Boot and deployed the application on Kubernetes/Istio platform.
The different microservices communicate with each other using either JMS (ActiveMQ) or REST API.
I am getting the tracing of REST communication on Istio's Jaeger but the JMS based communication is missing in Jaeger.
I am using ElasticSearch to store my application logs.
Is it possible to use the same ElasticSearch as a backend(DB) of Jaeger?
If yes then I will store tracing specific logs in ElasticSearch and query them on Jaeger UI.
I believe you can reuse Elasticsearch for multiple purposes - each would use a different set of indices, so separation is good.
from: https://www.jaegertracing.io/docs/1.11/deployment/ :
Collectors require a persistent storage backend. Cassandra and Elasticsearch are the primary supported storage backends
Tying the networking all together, a docker-compose example:
How to configure Jaeger with elasticsearch?
While this isn't exactly what you asked, it sounds like what you're trying to achieve is seeing tracing for your JMS calls in Jaegar. If that is the case, you could use an OpenTracing tracing solution for JMS or ActiveMQ to report tracing data directly to Jaegar. Here's one potential solution I found with a quick google. There may be others.
https://github.com/opentracing-contrib/java-jms

How to connect to several cassandra clusters at same time with Spring Boot Starter Data Cassandra

I need to connect to different cassandra clusters depended on input data. I have idea how to achieve that with manually creating cassandraTemplate for each cluster. but what about spring-boot-starter-data-cassandra? Does it allow achieve same behavior?

Apache Kafka Connect With Springboot

I'm trying to find examples of kafka connect with springboot. It looks like there is no spring boot integration for kafka connect. Can some one point me in the right direction to be able to listen to changes on mysql db?
Kafka Connect doesn't really need Spring Boot because there is nothing for you to code for it, and it really works best when ran in distributed mode, as a cluster, not embedded within other (single-instance) applications. I suppose if you did want to do it, then you could copy relevent portions of the source code, but that of course isn't using Spring Boot, and you'd have to wire it all yourself
The framework itself consists of a few core Java dependencies that have already been written (Debezium or Confluent JDBC Connector, for your mysql example), and two config files. One for Kafka Connect to know the bootstrap servers, serializers, etc. and another for the actual MySQL connector. So, if you want to use Kafka Connect, run it by itself, then just write the consumer in the Spring app.
The alternatives to Kafka Connect itself would be to use Apache Camel within a Spring application (Spring Integration) or Spring Cloud Dataflow and interfacing with those Kafka "components" (which aren't using the Connect API, AFAIK)
Another option, specific for listening to MySQL, is to use Debezium Engine within your code.

How to make Spring kafka client distributed

I have messages coming in from Kafka. So I am planning to write a listener and "onMessage". I want to process it and push it in to solr.
So my question is more architectural, like I have worked on web apps all my career, so in big data how to deploy the spring kafka listener, so I can process thousands of messages a second.
How do I make my spring code use multiple nodes to distribute the
load?
I am planning to write a SpringBoot application to run in
a tomcat container.
If you use the same group id for all instances, different partitions will be assigned to different consumers (instances of your application).
So, be sure that you specified enough partitions in the topic you are going to consume.

How to monitor streaming apps Inside SCDF?

I am novice to Spring Cloud Data flow and Stream Cloud Streaming Applications.
Currently my project diagram looks like following :
I route a POST request from outside client using zuul API gateway to a microservice called Composite. Composite creates a stream using REST POST and deployes onto Spring Cloud Data Flow Server. As far as I know the microservices mongodb and file run as co-existing JVM processes. If My client has to know the status of stream, status of the processed data, How should Composite Microservice interact with Spring Cloud Data Flow Server? Currently when I make POST call to deploy the stream I dont even get the status from SCDF Server. Does SCDF expose any hooks to look at the individual apps? Also how can I change the flow #runtime to create a dynamic mesh?
Currently I am using Local Spring Cloud Data Flow Server for development.
Runtime platform is local
Local runtime is recommended only for development purpose and if you're preparing for production, please make sure to choose a platform variant (eg: cf, k8s, yarn, ..) that comes with non-functional requirements to support reliable and durable execution of all the applications running in streaming pipeline.
As far as I know the microservices mongodb and file run as co-existing JVM processes.
If your stream definition is file | mongodb, you'd have 2 different JVM's even when using Local runtime. They're independent Boot applications.
How should Composite Microservice interact with Spring Cloud Data Flow Server?
Not clear what you mean by "composite" here. All the microservice applications in SCDF communicate via messaging middleware such as Kafka or Rabbit. SCDF provides the orchestration capability to run such applications into various runtime platforms.
Currently when I make POST call to deploy the stream I dont even get the status from SCDF Server
You can use SCDF's REST-APIs to query for current status of the apps and it is platform agnostic. You can view the list of supported APIs by hitting the root URL (see image below) - there's a gap in docs - we will fix it. Following APIs could be useful for status checks.
Does SCDF expose any hooks to look at the individual apps?
Once the apps are deployed in a runtime platform, you can take advantage of Boot's actuator endpoints to explore more details such as trace, metrics, health, env among others at each application level. See Boot's actuator endpoints for more details. For instance, if your mongodb app is running locally and on port 23000, then you can check granular metrics for this application at: http://localhost:23000/metrics.
[As an FYI: future SCDF releases would include integrating Spring Boot + Spring Cloud Sleuth metrics and visual representation of the same.]
Also how can I change the flow #runtime to create a dynamic mesh?
If you're referring to editing a running streaming pipeline with addition/deletes, we are currently exploring design approach to support this functionality.

Resources