Any concept of a global variable in streams in Spring XD? - spring-xd

Scenario: A stream definition in Spring XD has the following structure:
jms | filter | transform | hdfs
In the filter module, I fire a query to a database to verify if the current message is applicable for further processing.
When the condition is met, the message passes on to the transform module.
In the transform module, I would like to have access to the query results from the filter module.
Currently, I end up having to fire a query once more inside the transform to access the same result set.
Is there any form of a global variable that can apply during the lifetime of a message as it passes from source to sink across the different modules? That would help reduce the latency of reading from the database.
If this isn't possible, what would be a recommended alternative?

You would typically use a transformer (or a header enricher) for this, to set a message header with the query result; use that header in the filter, and the header will remain available to downstream modules, including your transformer.
<int:chain input-channel="input" output-channel="output">
    <int:header-enricher ... />
    <int:filter ... />
</int:chain>
This (passing arbitrary headers) currently works out of the box only with the rabbit (and local) transports, or when direct binding is enabled.
When using the redis transport, you have to configure the bus to add your header to those it passes.
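As a rough sketch of that suggestion (the DAO, its lookup method, and the "queryResult" header name are all made-up placeholders), a transformer bean placed before the filter could run the query once and stash the result in a header:

import org.springframework.integration.annotation.Transformer;
import org.springframework.messaging.Message;
import org.springframework.messaging.support.MessageBuilder;

public class QueryResultEnricher {

    private final OrderDao dao; // hypothetical DAO wrapping the database query

    public QueryResultEnricher(OrderDao dao) {
        this.dao = dao;
    }

    // Runs once, before the filter; the filter and the downstream transform
    // module can then read the "queryResult" header instead of re-querying.
    @Transformer
    public Message<?> enrich(Message<String> message) {
        Object result = dao.lookup(message.getPayload());
        return MessageBuilder.fromMessage(message)
                .setHeader("queryResult", result)
                .build();
    }
}

The filter can then test the header with a SpEL expression such as headers['queryResult'] != null, and the transform module reads the same header, so the database is hit only once per message.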

Related

What is the best way to pass Kafka Headers in distributed applications?

My team and I are currently working on a project in which we are developing a distributed Kafka processing application, and we want to know how to conveniently pass Kafka headers through its components.
What does this mean? Consider the following sample setup:
Component A receives events and, based on their content, generates a unique header with metadata about the event. After processing, the result is written to Topic A together with the generated header.
Component B and Component C receive these events, do their own processing, and write their results to Topic B and Topic C. These components don't use the header generated by Component A.
But Component D needs it, so Component B and Component C must receive the header and pass it through.
Our system in the project is a bit bigger than this example, which is why we ask ourselves: what is the best way to pass Kafka headers through these components? Is there an automatic way?
We considered the following approaches:
Don't use Kafka headers at all and pass the metadata information through the message body.
Usage of interceptors (if that is even possible)
FYI we use Spring Boot.
Thanks in advance,
Michael
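A minimal sketch of the manual pass-through, assuming Spring Kafka (the topic names, the "event-metadata" header name, and the processing method are invented): each intermediate component copies the header from the consumed record onto the record it produces.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;

public class ComponentB {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public ComponentB(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @KafkaListener(topics = "topic-a")
    public void onEvent(ConsumerRecord<String, String> record) {
        String result = doProcessing(record.value()); // component-specific work
        ProducerRecord<String, String> out =
                new ProducerRecord<>("topic-b", record.key(), result);
        // copy the metadata header generated by Component A, if present
        Header meta = record.headers().lastHeader("event-metadata");
        if (meta != null) {
            out.headers().add(meta);
        }
        kafkaTemplate.send(out);
    }

    private String doProcessing(String value) {
        return value; // placeholder for the real processing
    }
}

A producer interceptor cannot fully automate this on its own, because it only sees the outgoing record; the consumed header still has to be carried to the send call by the application code.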

Split request - send to either one or different endpoints - combine the responses

I am using Spring Boot 2 and Apache Camel 2.24 to build an API gateway which exposes REST endpoints that receive JSON/XML requests and does the following:
Validate the incoming JSON/XML.
Convert the incoming request to the format the downstream expects.
Push the request to a Camel route which invokes the downstream REST endpoint and returns the response.
Convert the response to the expected format and return it to the client.
Currently I have the route configured as below:
from("direct:camel").process(preprocessor).process(httpClientProcessor).process(postprocessor);
httpClientProcessor - does the main job of invoking the downstream endpoint and returning the response.
preprocessor - splits the request (for configured request types) and puts the parts in a list before passing them to httpClientProcessor.
postprocessor - does the following based on the content type:
XML - removes the XML declaration from the individual responses and combines them into one single response under one root element.
JSON - combines the JSON responses under one parent array element.
There can be cases where the multiple requests need to be sent to the same endpoint, or each to a unique endpoint. Currently I have that logic in httpClientProcessor. One issue with this approach is that I can invoke the downstream endpoints only one after another rather than in parallel (unless I add a thread pool executor to httpClientProcessor).
I am new to Apache Camel and hence started with this basic route configuration. Going through the documentation, I came across Camel components like split(), parallelProcessing(), multicast and the aggregator, but I am not sure how to plug these together to achieve my requirement:
Split the incoming request using a pre-configured delimiter to create multiple requests. The incoming request may or may not contain multiple requests.
Based on the endpoint URL configuration, either post all split requests to the same endpoint, or post each request to a unique endpoint.
Combine the responses from all of them into one master response (XML or JSON) as the output of the route.
Please advise.
This sounds like you should have a look at the Camel Split EIP.
Especially
How to split a request into parts and re-aggregate the parts later
How to process each part in parallel
For the dynamic downstream endpoints, you can use a dynamic To (.toD(...)) or the Recipient List EIP. The latter can even send a message to multiple endpoints, or to none at all.
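A sketch of how those pieces could fit together (assuming the preprocessor puts the split parts in a List and stores each part's destination in a hypothetical targetEndpoint header; the inline lambda is a simple AggregationStrategy that collects the response bodies):

import java.util.ArrayList;
import java.util.List;
import org.apache.camel.Processor;
import org.apache.camel.builder.RouteBuilder;

public class GatewayRoute extends RouteBuilder {

    private final Processor preprocessor;
    private final Processor postprocessor;

    public GatewayRoute(Processor preprocessor, Processor postprocessor) {
        this.preprocessor = preprocessor;
        this.postprocessor = postprocessor;
    }

    @Override
    public void configure() {
        from("direct:camel")
            .process(preprocessor)                  // validate, convert, build the List of parts
            .split(body(), (oldExchange, newExchange) -> {
                // collect each downstream response body into one List
                if (oldExchange == null) {
                    List<Object> bodies = new ArrayList<>();
                    bodies.add(newExchange.getIn().getBody());
                    newExchange.getIn().setBody(bodies);
                    return newExchange;
                }
                List<Object> bodies = oldExchange.getIn().getBody(List.class);
                bodies.add(newExchange.getIn().getBody());
                return oldExchange;
            })
                .parallelProcessing()               // call the endpoints concurrently
                .toD("${header.targetEndpoint}")    // per-part dynamic endpoint
            .end()
            .process(postprocessor);                // combine into one XML/JSON response
    }
}

Note that some AggregationStrategy is needed here: without one, the splitter in Camel 2.x hands the original input message, not the collected responses, to the postprocessor.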

Kafka Streams: using context.forward() from a processor called in the DSL API

I have a processor and would like to call context.forward() in it. However, I feel like I need to set a sink topic for it to actually get forwarded. If I were using the Topology directly, I would just .addSource(), .addProcessor(), .addSink(). With the DSL, however, I have a StreamsBuilder/KStream. Is there any way to use context.forward() when calling a processor from the DSL?
NOTE: I need to use a processor instead of a transform, as I have custom logic on when to forward records downstream.
stream.process(() -> new WindowAggregatorProcessor(storeName), storeName);
stream.process() is a terminal operation in the DSL. You can use stream.transform() instead to get an output stream. A Transformer is basically the same as a Processor.
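A minimal sketch of that approach, building on the snippet above (shouldForward() stands in for the custom logic, and the output topic name is invented): records go downstream only via context.forward(), returning null emits nothing extra, and the stream returned by transform() can be routed with to().

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

stream.transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
    }

    @Override
    public KeyValue<String, String> transform(String key, String value) {
        if (shouldForward(key, value)) {   // custom decision on when to emit
            context.forward(key, value);   // can be called zero or more times
        }
        return null;                       // emit nothing via the return value
    }

    @Override
    public void close() { }
}, storeName)
.to("output-topic");

Both the forwarded records and any non-null return values flow into the same output KStream, so the to() call at the end gives them their sink topic.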

Using Spring Integration to split a large XML file into individual smaller messages and process each individually

I am using Spring Integration and have a large XML file containing a collection of child items. I want to split the file into a set of messages, where the payload of each message is one of the child XML fragments.
Using a splitter is the obvious choice, but this requires returning a collection of messages, which will exhaust the memory; I need to split the file into individual messages and process them one at a time (or, more likely, with a multi-threaded task executor).
Is there a standard way to do this without writing a custom component that writes the sub-messages to a channel programmatically?
I have been looking for a similar solution and have not found any standard way of doing this either.
Here is a rather dirty fix, if anyone needs this behavior implemented:
Split the files manually using a Service Activator or a Splitter with a custom bean:
<int:splitter input-channel="rawChannel" output-channel="splitChannel" id="splitter">
    <bean class="com.a.b.c.MYSplitter" />
</int:splitter>
Your custom bean should implement ApplicationContextAware so the application context can be injected by Spring.
Manually retrieve the output channel and send each sub-message:
MessageChannel splitChannel = (MessageChannel) applicationContext.getBean("splitChannel");
Message<String> message = new GenericMessage<String>(payload);
splitChannel.send(message);
For people coming across this very old question: splitters can now handle results of type Iterable, Iterator, Stream, and Flux (Project Reactor). If any of these types are returned, messages are emitted one at a time.
Iterator/Iterable since 4.0.4; Stream/Flux since 5.0.0.
There is also now a FileSplitter, which emits file contents a line at a time via an Iterator (since 4.1.2).
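For this question that means a custom splitter can simply return a lazy Iterator over the fragments; a rough sketch, where StaxItemIterator is a hypothetical StAX-based iterator over the child elements of the large file:

import java.io.File;
import java.util.Iterator;
import org.springframework.integration.annotation.Splitter;

public class XmlItemSplitter {

    // Returning an Iterator (Spring Integration 4.0.4+) makes the splitter
    // emit one child fragment at a time instead of building the whole List.
    @Splitter
    public Iterator<String> split(File xmlFile) {
        return new StaxItemIterator(xmlFile); // hypothetical lazy StAX iterator
    }
}

Combined with a downstream channel backed by a task executor, the fragments can then be processed concurrently without ever holding them all in memory.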

How does a Spring WebApp manage different clients' requests?

Probably there are a lot of people who will smile reading this question...
Here's my problem.
I have a Spring 3 web application acting both as a client and a server. It gets some XML data from a client "C", processes it, and sends it to a server "S".
The input XML from C must be validated against a schema (e.g. "c.xsd"), while the output XML to S must be validated against a different one (e.g. "s.xsd").
I'm using jaxb2 for marshalling and unmarshalling.
In the documentation I read that it is possible to set the "schema" attribute for the [un]/marshaller.
Therefore, I need c.xsd for validation when I receive an input and s.xsd when I produce an output... the question is the following:
When I switch the validation schema from c.xsd to s.xsd (producing an output after processing a request from C), do I change the status of the server? In other words, if I am receiving a second request from a client C2 while I'm processing the first request from C, will I attempt to validate C2's input against s.xsd? Will the application automatically put the C2 request on a different thread? If not, how can I configure Spring to do so?
Thanks a lot!
I'll take a stab at it:
The input XML from C must be validated
against a schema (e.g. "c.xsd")
You can do this by setting a schema (c.xsd) on the Unmarshaller.
while the output XML to S must be
validated against a different one
(e.g. "s.xsd").
You can do this by setting a schema (s.xsd) on the Marshaller.
when I switch the validation schema
from c.xsd to s.xsd (producing an
output after processing a request from
C), do I change the status of the
server?
No, because the Unmarshaller is always using c.xsd and the Marshaller is always using s.xsd.
Since Marshaller and Unmarshaller are not thread-safe, you must be sure not to share them among threads.
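Concretely, the usual pattern is to share a single thread-safe JAXBContext and create a fresh Unmarshaller/Marshaller per request; a minimal sketch (Order.class and the schema file locations are placeholders):

import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

// created once and shared; JAXBContext is thread safe
JAXBContext context = JAXBContext.newInstance(Order.class);
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema cSchema = sf.newSchema(new File("c.xsd"));
Schema sSchema = sf.newSchema(new File("s.xsd"));

// created per request / per thread
Unmarshaller unmarshaller = context.createUnmarshaller();
unmarshaller.setSchema(cSchema);   // validate input from C

Marshaller marshaller = context.createMarshaller();
marshaller.setSchema(sSchema);     // validate output to S

Spring's Jaxb2Marshaller follows the same pattern internally (it creates a new Marshaller or Unmarshaller per operation), which is why one instance configured with c.xsd and another with s.xsd can safely serve concurrent requests.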
