Spring Integration Scatter-Gather pattern with JMS transport

I need to implement the following architecture:
I have data that must be sent to several systems (some external applications) using JMS.
Depending on the data, messages go only to the relevant systems (for example, with 4 systems, a given request may be sent to anywhere from 1 to 4 of them).
The process must wait for a response from every system that was sent a message; after all responses have been received (or a timeout has elapsed), the collected data must be processed.
The correlation id is carried in the header of both outgoing and incoming JMS messages.
Each such process can be started asynchronously and in parallel.
Right now I have this implemented with plain Spring JMS only: I synchronize the threads manually and also manage the thread pools myself.
The correlation ids and the information about which systems were sent messages are kept as state and updated as new messages arrive, and so on.
But I want to simplify the logic and use the Spring Integration Java DSL, the Scatter-Gather pattern (which is exactly my case), and other useful Spring features.
Can you show me an example of how such an architecture can be implemented with Spring Integration / IntegrationFlow?

Here is a sample from our test cases:
@Bean
public IntegrationFlow scatterGatherFlow() {
    return f -> f
        .scatterGather(scatterer -> scatterer
                .applySequence(true)
                .recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10))
                .recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10))
                .recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10)),
            gatherer -> gatherer
                .releaseStrategy(group ->
                    group.size() == 3 ||
                        group.getMessages()
                            .stream()
                            .anyMatch(m -> (Double) m.getPayload() > 5)),
            scatterGather -> scatterGather
                .gatherTimeout(10_000));
}
So, these are the parts:
scatterer - sends messages to the recipients; in your case, all those JMS services. It can also be a scatterChannel, though, typically a PublishSubscribeChannel, so the Scatter-Gather might not know its subscribers in advance.
gatherer - well, it is just an aggregator, with all its possible options.
scatterGather - is just a convenience for the direct properties of the ScatterGatherHandler and the common endpoint options.
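For the JMS transport, each recipient flow can wrap a JMS outbound gateway, which takes care of the request/reply correlation for you. Below is a minimal sketch, not a definitive implementation: the toSystemN routing headers, the destination names, and the bean wiring are all assumptions you would adapt to your systems.

import jakarta.jms.ConnectionFactory; // javax.jms.ConnectionFactory on pre-Jakarta Spring versions

import java.util.stream.Collectors;

import org.springframework.context.annotation.Bean;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.jms.dsl.Jms;
import org.springframework.messaging.Message;

// inside a @Configuration class
@Bean
public IntegrationFlow jmsScatterGatherFlow(ConnectionFactory connectionFactory) {
    return f -> f
        .scatterGather(scatterer -> scatterer
                .applySequence(true) // sequence headers tell the gatherer how many replies to expect
                .recipientFlow(m -> m.getHeaders().containsKey("toSystem1"), // hypothetical routing header
                    sf -> sf.handle(Jms.outboundGateway(connectionFactory)
                        .requestDestination("system1.requests") // hypothetical destination names
                        .replyDestination("system1.replies")
                        .correlationKey("JMSCorrelationID"))) // correlate request and reply via the JMS header
                .recipientFlow(m -> m.getHeaders().containsKey("toSystem2"),
                    sf -> sf.handle(Jms.outboundGateway(connectionFactory)
                        .requestDestination("system2.requests")
                        .replyDestination("system2.replies")
                        .correlationKey("JMSCorrelationID"))),
            gatherer -> gatherer
                // collect all reply payloads into one list for downstream processing
                .outputProcessor(group -> group.getMessages()
                    .stream()
                    .map(Message::getPayload)
                    .collect(Collectors.toList())),
            scatterGather -> scatterGather
                .gatherTimeout(10_000)); // don't wait forever if one system never answers
}

Each incoming message opens its own correlation group in the gatherer, so many such processes can run asynchronously and in parallel without any manual thread or state management.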

Related

Jdbi transaction - multiple methods - Resources should be closed

Suppose I want to run two sql queries in a transaction I have code like the below:
jdbi.useHandle(handle -> handle.useTransaction(h -> {
    var id = h.createUpdate("some query")
        .executeAndReturnGeneratedKeys()
        .mapTo(Long.class)
        .findOne()
        .orElseThrow(() -> new IllegalStateException("No id"));
    h.createUpdate("INSERT INTO SOMETABLE (id) " +
            "VALUES (:id , xxx);")
        .bind("id", id)
        .execute();
}));
Now, as the complexity grows, I want to extract each update into its own method:
jdbi.useHandle(handle -> handle.useTransaction(h -> {
    var id = someQuery1(h);
    someQuery2(id, h);
}));
...with someQuery1 looking like:
private Long someQuery1(Handle handle) {
    return handle.createUpdate("some query")
        .executeAndReturnGeneratedKeys()
        .mapTo(Long.class)
        .findOne()
        .orElseThrow(() -> new IllegalStateException("No id"));
}
Now when I refactor to the latter, I get a SonarQube blocker bug on handle.createUpdate in someQuery1, stating:
Resources should be closed
Connections, streams, files, and other
classes that implement the Closeable interface or its super-interface,
AutoCloseable, needs to be closed after use…
I was under the impression that, because I'm using jdbi.useHandle (and passing the same handle to the called methods), a callback would be used and the handle would be released immediately upon return. As per the jdbi docs:
Both withHandle and useHandle open a temporary handle, call your
callback, and immediately release the handle when your callback
returns.
Any help / suggestions appreciated.
TIA
SonarQube doesn't know any specifics of the JDBI implementation and simply triggers on an AutoCloseable/Closeable not being closed. Just suppress the Sonar issue and/or file a feature request with the SonarQube team to improve this behavior.
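For example, the issue can be suppressed right on the extracted method. A small sketch, assuming the reported rule is java:S2095 (the "Resources should be closed" rule; check the exact key in your SonarQube report):

// The handle's lifecycle is managed by jdbi.useHandle(...), so nothing leaks here;
// the suppression is scoped to this one method.
@SuppressWarnings("java:S2095") // rule key assumed from the "Resources should be closed" message
private Long someQuery1(Handle handle) {
    return handle.createUpdate("some query")
        .executeAndReturnGeneratedKeys()
        .mapTo(Long.class)
        .findOne()
        .orElseThrow(() -> new IllegalStateException("No id"));
}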

Spring Integration DSL how to route split messages to different concurrent flows?

I probably hate writing noob questions as much as other people hate answering them, but here goes.
I need to split a message retrieved from a JdbcPollingChannelAdapter into multiple messages based on the operation requested in each row of the resultset in the payload.
The split operation is simple enough. What is proving to be a challenge is conditionally routing the message to one flow or the other.
After much trial and error, I believe that this flow represents my intention
/- insertUpdateAdapter -\
Poll Table -> decorate headers -> split -> router -< >- aggregator -> cleanup
\---- deleteAdapter ----/
To that end I have constructed this Java DSL:
final JdbcOutboundGateway inboundAdapter = createInboundAdapter();
final JdbcOutboundGateway deleteAdapter = createDeleteAdapter();
final JdbcOutboundGateway insertUpdateAdapter = createInsertUpdateAdapter();
return IntegrationFlows
    .from(setupAdapter,
        c -> c.poller(Pollers.fixedRate(1000L, TimeUnit.MILLISECONDS).maxMessagesPerPoll(1)))
    .enrichHeaders(h -> h.headerExpression("start", "payload[0].get(\"start\")")
        .headerExpression("end", "payload[0].get(\"end\")"))
    .handle(inboundAdapter)
    .split(insertDeleteSplitter)
    .enrichHeaders(h -> h.headerExpression("operation", "payload[0].get(\"operation\")"))
    .channel(c -> c.executor("stepTaskExecutor"))
    .routeToRecipients(r -> r
        .recipientFlow("'I' == headers.operation or 'U' == headers.operation",
            f -> f.handle(insertUpdateAdapter))
        // This element is complaining "Syntax error on token ")", ElidedSemicolonAndRightBrace expected"
        // Attempted to follow patterns from https://github.com/spring-projects/spring-integration-java-dsl/wiki/Spring-Integration-Java-DSL-Reference#routers
        .recipientFlow("'D' == headers.operation",
            f -> f.handle(deleteAdapter))
        .defaultOutputToParentFlow())
    ) // <- the extra parenthesis pointed out in the answer below
    .aggregate()
    .handle(cleanupAdapter)
    .get();
Assumptions I have made, based on prior work include:
The necessary channels are auto-created as Direct Channels
Route To Recipients is the appropriate tool for this function (I have also considered an expression router, but the examples of how to add sub-flows were less clear than those for Route To Recipients)
Insert an ExecutorChannel somewhere between the splitter and the router if you want to run the splits in parallel. You can limit the executor's pool size to control the concurrency.
There is an extra parenthesis after .defaultOutputToParentFlow().
The corrected code is:
return IntegrationFlows
    .from(setupAdapter,
        c -> c.poller(Pollers.fixedRate(1000L, TimeUnit.MILLISECONDS).maxMessagesPerPoll(1)))
    .enrichHeaders(h -> h.headerExpression("ALC_startTime", "payload[0].get(\"ALC_startTime\")")
        .headerExpression("ALC_endTime", "payload[0].get(\"ALC_endTime\")"))
    .handle(inboundAdapter)
    .split(insertDeleteSplitter)
    .enrichHeaders(h -> h.headerExpression("ALC_operation", "payload[0].get(\"ALC_operation\")"))
    .channel(c -> c.executor(stepTaskExecutor))
    .routeToRecipients(r -> r
        .recipientFlow("'I' == headers.ALC_operation or 'U' == headers.ALC_operation",
            f -> f.handle(insertUpdateAdapter))
        .recipientFlow("'D' == headers.ALC_operation",
            f -> f.handle(deleteAdapter))
        .defaultOutputToParentFlow())
    .aggregate()
    .handle(cleanupAdapter)
    .get();
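The stepTaskExecutor referenced in .channel(c -> c.executor(stepTaskExecutor)) still needs to be defined somewhere. A minimal sketch (the pool sizes are illustrative); capping the pool size is what bounds how many splits are processed concurrently:

import org.springframework.context.annotation.Bean;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Bean
public ThreadPoolTaskExecutor stepTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(4); // illustrative: at most 4 splits in flight at once
    executor.setMaxPoolSize(4);
    executor.setThreadNamePrefix("step-");
    return executor; // Spring initializes the pool when the bean is created
}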

JMS Message failed to receive

I have a connection/session pool through which a listener class receives the messages published on a topic, using a StompJMS selector, via Apollo (apache-apollo-1.7.1).
Say I have 4 objects (ob1, ob2, ob3, ob4) listening on the topic using the same connection/session pool. First, a registration request is sent for all 4 objects, using the 4 selectors (s1, s2, s3, s4), to receive a set of features (a, b, c) from the topic.
The DeliveryMode is set to PERSISTENT for the JMS producer.
The selectors look like:
s1: "((SYMBOL_NAME='ob1.name()') AND ( MESSAGE_TYPE='SIGNAL') AND ((SIGNAL_NAME='A') OR (SIGNAL_NAME='B') OR (SIGNAL_NAME='C') OR(SIGNAL_NAME='D'))"
s2: "((SYMBOL_NAME='ob2.name()') AND( MESSAGE_TYPE='SIGNAL') AND ((SIGNAL_NAME='A') OR (SIGNAL_NAME='B') OR (SIGNAL_NAME='C') OR(SIGNAL_NAME='D'))"
s3: "((SYMBOL_NAME='ob3.name()') AND( MESSAGE_TYPE='SIGNAL') AND ((SIGNAL_NAME='A') OR (SIGNAL_NAME='B') OR (SIGNAL_NAME='C') OR(SIGNAL_NAME='D'))"
s4: "((SYMBOL_NAME='ob4.name()') AND ( MESSAGE_TYPE='SIGNAL') AND ((SIGNAL_NAME='A') OR (SIGNAL_NAME='B') OR (SIGNAL_NAME='C') OR(SIGNAL_NAME='D'))"
Another Java application is publishing the features a, b, c for ob1, ob2, ob3, ob4 on the topic.
The listener class receives the values for any 3, say (ob1, ob2, ob3), but it's not receiving the values for ob4; sometimes it receives them for (ob1, ob3, ob4) and not for ob2. The object it fails to receive for is not fixed.
One reason I can think of is that the JMS selector fails to pick up the features for ob4; the other is that the Apollo connection broke down, but the latter seems very unlikely, because in that case the other objects would also have been affected.
Please let me know if there's an issue with the selector.

How to safely write to one file from many verticle instances in vert.x 3.2?

Instead of using a logger or a database server, I'd like to append information to one file from possibly many verticle instances.
There are versions of the file-system methods for writing to a file asynchronously.
Can I assume that Vert.x handles the synchronisation between the writes, so that they don't interfere with each other, when using those versions of the methods marked as "async"?
There seems to be a rule that one can rely on Vert.x providing all the isolation between concurrent processing out of the box. But is that true in the case of concurrent file writes?
Could you please include a code snippet in the answer that shows how to open and write to one file from many verticle instances with the finest possible granularity, e.g. for logging requests?
I wouldn't recommend writing to a single file with many different "writers". For concurrent logging I would stick to the Single Writer principle:
create a verticle which subscribes to the event bus and listens for messages to be logged. Let's call this verticle Logger; it listens on the address system.logger.
EventBus eb = vertx.eventBus();
eb.consumer("system.logger", message -> {
    // write to file
});
Verticles which would like to log something then need to send a message to the Logger verticle:
eventBus.send("system.logger", "foobar");
Appending to an existing file works something like this (didn't test):
vertx.fileSystem().open("file.log", new OpenOptions(), result -> {
    if (result.succeeded()) {
        AsyncFile file = result.result();
        Buffer buff = Buffer.buffer(message); // message from the consumer above
        // writePos is a field of the Logger verticle tracking the end of the
        // file (the original snippet used an undefined loop index here)
        file.write(buff, writePos, ar -> {
            if (ar.succeeded()) {
                System.out.println("done");
            } else {
                System.err.println("write failed: " + ar.cause());
            }
        });
        writePos += buff.length();
    } else {
        System.err.println("open file failed " + result.cause());
    }
});
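Putting the pieces together as one deployable verticle — a sketch under the same assumptions (untested, like the snippet above; the class name, file name, and the system.logger address are illustrative). Because all writes funnel through this single verticle's event loop, no extra synchronisation is needed:

import io.vertx.core.AbstractVerticle;
import io.vertx.core.buffer.Buffer;
import io.vertx.core.file.AsyncFile;
import io.vertx.core.file.OpenOptions;

public class LoggerVerticle extends AbstractVerticle {

    private AsyncFile file;
    private long writePos; // next write offset; only touched on this verticle's event loop

    @Override
    public void start() {
        vertx.fileSystem().open("file.log", new OpenOptions().setCreate(true), result -> {
            if (result.succeeded()) {
                file = result.result();
                // single writer: every log line from any verticle arrives here
                vertx.eventBus().consumer("system.logger", message -> {
                    Buffer buff = Buffer.buffer(message.body() + "\n");
                    file.write(buff, writePos, ar -> {
                        if (ar.failed()) {
                            System.err.println("write failed: " + ar.cause());
                        }
                    });
                    writePos += buff.length();
                });
            } else {
                System.err.println("open file failed " + result.cause());
            }
        });
    }
}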

Spark Streaming with multiple Kafka streams

I create Kafka streams with the following code:
val streams = (1 to 5) map { i =>
  KafkaUtils.createStream[....](
    streamingContext,
    Map( .... ),
    Map(topic -> numOfPartitions),
    StorageLevel.MEMORY_AND_DISK_SER
  ).filter(...)
    .mapPartitions(...)
    .reduceByKey(....)
}
val unifiedStream = streamingContext.union(streams)
unifiedStream.foreachRDD(...)
streamingContext.start()
I give each stream a different group id. When I run the application, only some of the Kafka messages are received, and the executor is stuck at the foreachRDD call. If I only create one stream, everything works well. There aren't any exceptions in the logs.
I don't know why the application is stuck there. Does it mean there aren't enough resources?
You may want to try setting the parameter
new SparkConf().set("spark.streaming.concurrentJobs", "5")
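By default Spark Streaming runs only one job at a time, so with several unioned receiver streams the later batches can sit pending behind the first; raising spark.streaming.concurrentJobs lets them run in parallel. A sketch of where the setting goes (in Java here; the app name, master URL, and batch interval are placeholders, and note that each receiver-based stream also occupies an executor core, so the application needs more cores than receivers):

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class StreamingApp {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
            .setAppName("multi-kafka-streams")           // placeholder
            .setMaster("local[8]")                       // placeholder: more cores than receivers
            .set("spark.streaming.concurrentJobs", "5"); // allow up to 5 jobs to run concurrently
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(1));
        // ... create the streams, union them, and call foreachRDD as in the question ...
        ssc.start();
        ssc.awaitTermination();
    }
}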
