What is the best way to pass Kafka Headers in distributed applications? - spring

My team and I are currently working on a project in which we are developing a distributed Kafka processing application, and we want to know how to conveniently pass Kafka headers through its components.
What does this mean? Please have a look at the following example:
Component A receives events and, based on their content, generates a unique header containing metadata about the event. After processing, the result is written to Topic A together with the generated header.
Component B and Component C consume these events, process them and write their results to Topic B and Topic C. These components don't use the header generated by Component A.
But Component D does need it, so Component B and Component C must receive the header and pass it through.
Our actual system is quite a bit bigger than this example, which is why we asked ourselves: what is the best way to pass Kafka headers through these components? Is there an automatic way?
We have considered the following approaches:
Don't use Kafka headers at all and pass the metadata through the message body.
Use interceptors (if that is even possible).
FYI we use Spring Boot.
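To make the problem more concrete, this is roughly the manual pass-through we would otherwise have to repeat in Component B and Component C (just a sketch with spring-kafka; the header name "x-event-metadata" and the topic names are made up):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.header.Header;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.stereotype.Component;

    @Component
    public class PassThroughProcessor {

        private final KafkaTemplate<String, String> kafkaTemplate;

        public PassThroughProcessor(KafkaTemplate<String, String> kafkaTemplate) {
            this.kafkaTemplate = kafkaTemplate;
        }

        @KafkaListener(topics = "topic-a")
        public void process(ConsumerRecord<String, String> incoming) {
            String result = doBusinessLogic(incoming.value());
            ProducerRecord<String, String> outgoing =
                    new ProducerRecord<>("topic-b", incoming.key(), result);

            // Copy the metadata header from the incoming record so that
            // downstream components (like Component D) can still read it.
            Header metadata = incoming.headers().lastHeader("x-event-metadata");
            if (metadata != null) {
                outgoing.headers().add(metadata);
            }
            kafkaTemplate.send(outgoing);
        }

        private String doBusinessLogic(String value) {
            return value; // placeholder for the actual processing
        }
    }

We would like to avoid writing this boilerplate by hand in every component.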
Thanks in advance,
Michael

Related

Spring cloud stream RabbitMQ interceptors getting message in byte[]

I am facing the same issue described in detail in the following link:
https://github.com/spring-cloud/spring-cloud-stream/issues/1686
Thanks for the help.
That is because payload type transformation occurs only once it is known which message handler (method) is going to be invoked and which type is declared as its input. Remember, there are still a lot of things in play here; routing is the best example, since only after the routing decision do we know which handler to invoke and how to extract the type information needed to perform the transformation.
None of this is known at the interceptor level, since it is too early. So the general recommendation is to never rely on the payload while a message is en route, and instead structure your messages so that any in-route decision can be made based on message headers, which are simple types. A good analogy is a postal service, which makes routing decisions based on the information on the envelope (the mailman does not open your letters and read the payload to make those decisions).
Anyway, a while back I wrote a blog post - https://spring.io/blog/2018/02/26/spring-cloud-stream-2-0-content-type-negotiation-and-transformation - have a read, as there are more details there.
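For illustration only, a sketch of an interceptor that keys off a header instead of the payload (the header name is invented):

    import org.springframework.messaging.Message;
    import org.springframework.messaging.MessageChannel;
    import org.springframework.messaging.support.ChannelInterceptor;

    public class HeaderBasedInterceptor implements ChannelInterceptor {

        @Override
        public Message<?> preSend(Message<?> message, MessageChannel channel) {
            // The payload may still be a raw byte[] at this point, but headers
            // are simple types that can safely be inspected before conversion.
            Object eventType = message.getHeaders().get("x-event-type");
            if ("ignore".equals(eventType)) {
                return null; // returning null drops the message
            }
            return message;
        }
    }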

bind destinations dynamically for producers and consumers (Spring)

I'm trying to send and receive messages on channels/topics whose destination names are stored in a database, so they can be added/modified/deleted at runtime, but I'm surprised how little I have found on the web. I'm using Spring Cloud Stream so that the underlying broker can be changed.
To send messages to dynamically bound destinations I'm going with BinderAwareChannelResolver.resolveDestination(target).send(message), but I haven't found anything that works the same way for receiving messages.
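For reference, the sending side I mentioned looks roughly like this (a sketch; the destination name would come from the database):

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.cloud.stream.binding.BinderAwareChannelResolver;
    import org.springframework.messaging.Message;
    import org.springframework.messaging.support.MessageBuilder;
    import org.springframework.stereotype.Component;

    @Component
    public class DynamicSender {

        @Autowired
        private BinderAwareChannelResolver resolver;

        public void sendTo(String destinationFromDb, String payload) {
            Message<String> message = MessageBuilder.withPayload(payload).build();
            // resolveDestination(..) creates or looks up the output binding at runtime
            resolver.resolveDestination(destinationFromDb).send(message);
        }
    }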
My questions are:
1. Is there something similar for receiving messages?
2. How can the message be processed periodically, as @StreamListener does?
3. And, not as important: can a subscriber be created automatically in case there is none?
Thanks for any help!
This is a bit outside the scope of the framework's original design. But I would also question your architecture: if you truly want to subscribe to an unlimited number of destinations, I wonder why? What is the underlying business requirement?
Keep in mind that even if we were to do it somehow, it would require dynamically creating a message listener container for each new destination, which would raise more questions, such as: how long would such a container have to live? Eventually you would run out of resources.
If, however, you are simply asking about the possibility of mapping multiple destinations to a single channel, so that all messages go to the same message handler (e.g. a StreamListener), then you can use the input binding's destination property and define multiple destinations delimited by commas.
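For illustration, a rough sketch of that last option with the annotation model (the binding name "input" and the destination names are just examples):

    // application.properties:
    //   spring.cloud.stream.bindings.input.destination=destination-a,destination-b

    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.cloud.stream.messaging.Sink;

    @EnableBinding(Sink.class)
    public class MultiDestinationListener {

        @StreamListener(Sink.INPUT)
        public void handle(String payload) {
            // Messages from both destinations arrive at this single handler
            System.out.println("Received: " + payload);
        }
    }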

Searching for an architecture approach for converting Kafka messages to other formats

We're using Kafka as a broker that takes notifications from different message sources and then routes them to one or more target apps like Slack or e-mail. With this approach it is necessary to convert the Kafka message into different output formats, such as JSON or an e-mail body, before it is sent to the apps.
I thought of having Spring Boot microservices at the target ends which take the message from Kafka, convert it into the target format using one of the common template engines like Velocity or FreeMarker, and then forward the converted result to the given target app.
Would you agree with such an approach or are there better ways, some caveats or even no-gos to do it this way? What about performance? Any experience in this?
Thanks for your honest assessment.
Why not have a single serialization format and let each service deserialize the payload for its own use case? Templating with something like Velocity or FreeMarker seems like a specific concern independent of the data used to populate the template. Maybe focus on broadcasting the raw data.
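A rough sketch of that idea: publish one JSON payload and let each target service deserialize it and apply its own, service-specific formatting (class, topic and field names below are invented):

    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.stereotype.Component;

    @Component
    public class SlackNotificationConsumer {

        // Hypothetical notification shape shared by all producers
        public static class Notification {
            public String source;
            public String subject;
            public String body;
        }

        private final ObjectMapper objectMapper = new ObjectMapper();

        @KafkaListener(topics = "notifications")
        public void onMessage(String json) throws Exception {
            // Every consumer reads the same raw JSON ...
            Notification n = objectMapper.readValue(json, Notification.class);
            // ... and applies its own formatting/templating for its target app
            String slackText = String.format("*%s* (%s)%n%s", n.subject, n.source, n.body);
            // hand slackText over to the Slack client here
        }
    }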

How does a Spring web app manage different clients' requests?

probably there are a lot of people who will smile reading this question...
Here's my problem.
I have a Spring 3 web application acting both as a client and a server. It gets some XML data from a client "C", processes it, and sends it to a server "S".
The input XML from C must be validated against a schema (e.g. "c.xsd"), while the output XML to S must be validated against a different one (e.g. "s.xsd").
I'm using jaxb2 for marshalling and unmarshalling.
In the documentation I read that it is possible to set the "schema" attribute on the (un)marshaller.
Therefore, I need to validate against c.xsd when I receive an input and against s.xsd when I produce an output... the question is the following:
When I switch the validation schema from c.xsd to s.xsd (producing an output after processing a request from C), do I change the state of the server? In other words, if I receive a second request from a client C2 while I'm processing the first request from C, will I attempt to validate C2's input against s.xsd? Will the application automatically put the C2 request on a different thread? If not, how can I configure Spring to do so?
Thanks a lot!
I'll take a stab at it:
The input XML from C must be validated against a schema (e.g. "c.xsd")
You can do this by setting a schema (c.xsd) on the Unmarshaller.
while the output XML to S must be validated against a different one (e.g. "s.xsd").
You can do this by setting a schema (s.xsd) on the Marshaller.
when I switch the validation schema from c.xsd to s.xsd (producing an output after processing a request from C), do I change the state of the server?
No, because the Unmarshaller is always using c.xsd and the Marshaller is always using s.xsd.
Since Marshaller and Unmarshaller are not thread-safe, you must be sure not to share them among threads.
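A minimal sketch of that arrangement with plain JAXB (rather than the Spring OXM wrapper): share the thread-safe JAXBContext, but create a fresh Unmarshaller/Marshaller per request and give each its own schema. The payload class below is just a placeholder:

    import java.io.File;
    import javax.xml.XMLConstants;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Marshaller;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;

    public class XmlGateway {

        @XmlRootElement
        public static class SomePayload {} // placeholder for the real JAXB class

        // JAXBContext is thread-safe and expensive to create: build it once and share it
        private final JAXBContext context;
        private final Schema inputSchema;   // c.xsd, used when unmarshalling input from C
        private final Schema outputSchema;  // s.xsd, used when marshalling output to S

        public XmlGateway() throws Exception {
            context = JAXBContext.newInstance(SomePayload.class);
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            inputSchema = sf.newSchema(new File("c.xsd"));
            outputSchema = sf.newSchema(new File("s.xsd"));
        }

        public SomePayload readFromC(File xml) throws Exception {
            // Unmarshaller is NOT thread-safe: create one per call
            Unmarshaller u = context.createUnmarshaller();
            u.setSchema(inputSchema);
            return (SomePayload) u.unmarshal(xml);
        }

        public void writeToS(SomePayload payload, File target) throws Exception {
            // Marshaller is NOT thread-safe either
            Marshaller m = context.createMarshaller();
            m.setSchema(outputSchema);
            m.marshal(payload, target);
        }
    }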

ETL, Esper or Drools?

The question relates to a Java EE / Spring environment.
I am developing a system which can start and stop arbitrary TCP (or other) listeners for incoming messages. There could be a need to authenticate these messages. These messages need to be parsed and stored in some other entities. These entities model which fields they store.
So, for example, if I have property1, which has two text fields FillLevel1 and FillLevel2, I could receive messages over TCP that have both fill levels specified in text as F1=100;F2=90.
Later I could add another field, say FillLevel3, when I start receiving messages like F1=xx;F2=xx;F3=xx. But this is a conscious decision on the part of the system modeler.
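In plain Java, parsing such a message into named fields would look roughly like this (just a sketch to illustrate the format):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class FillLevelParser {

        // Turns "F1=100;F2=90" (and later "F1=..;F2=..;F3=..") into field/value pairs
        public static Map<String, Integer> parse(String message) {
            Map<String, Integer> fields = new LinkedHashMap<>();
            for (String pair : message.split(";")) {
                String[] kv = pair.split("=", 2);
                fields.put(kv[0].trim(), Integer.valueOf(kv[1].trim()));
            }
            return fields;
        }
    }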
My question is: what do you think is better to use for parsing and storing the messages? ETL (using Pentaho, which is used in another system), where you store the raw messages and use a task executor to consume them one by one and store the transformed messages according to your rules?
Or one could use Esper or Drools to do the same thing, storing rules and executing them with a timer, but I am not sure how dynamic you can get when creating rules (they have to be created by the end user in a running system, preferably in the most user-friendly way, i.e. no scripts or code, only a GUI).
The end user should be capable of changing the parse rules. The end user might also want to change the archived data (for example, as above, if a new FillLevel field is added, one would like to put FillLevel=-99 into the previous records to make the data consistent).
Please ask for explanations, I have the feeling that I need to revise this question a bit.
Thanks
Well, Esper is a great CEP engine, but Drools has its own implementation, Drools Fusion, which integrates really well with jBPM. That would be a good choice.
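If it helps, a bare-bones sketch of driving Drools from Java (the session name and the fact class are invented; the actual rules would live in DRL resources that could be generated from a GUI):

    import org.kie.api.KieServices;
    import org.kie.api.runtime.KieContainer;
    import org.kie.api.runtime.KieSession;

    public class RuleRunner {

        // Hypothetical fact class representing one parsed field of a message
        public static class FillLevelReading {
            public final String field;
            public final int value;

            public FillLevelReading(String field, int value) {
                this.field = field;
                this.value = value;
            }
        }

        public static void main(String[] args) {
            KieServices ks = KieServices.Factory.get();
            // Loads the rules packaged on the classpath (kmodule.xml + .drl files)
            KieContainer container = ks.getKieClasspathContainer();
            KieSession session = container.newKieSession("fill-level-session");
            try {
                // Insert the parsed message as a fact and let the rules transform/store it
                session.insert(new FillLevelReading("F1", 100));
                session.fireAllRules();
            } finally {
                session.dispose();
            }
        }
    }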
