Anybody know of a good resource for a detailed (more so than the Spring Batch docs) look at the uses of JMS Item Writer/Reader in Spring Batch?
Specifically, and because I'm being tasked with trying to reuse an existing system whose only interface is asynchronous over a queue, I'm wondering if the following is possible:
Step 1: read some data and build a message.
Step 2: Drop the message on a queue using JmsItemWriter.
Step 3: Wait for the response to come back using JmsItemReader on the response queue.
Step 4: Do some other stuff
...
Rinse and repeat, a few thousand times a day.
Or in other words, essentially using Spring Batch to force synchronous interaction with an asynchronous resource. I'd like to make sure before I get further in research, that this is A) possible, and B) not shameless abuse of the framework that will cause major headaches down the road.
Thanks in advance for any info.
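To make steps 2 and 3 concrete, here is a minimal sketch of the "send, then block for the matching reply" idea. It is plain Java, not Spring Batch: the in-memory queues stand in for the JMS request and response destinations, and every name in it (RequestReplySketch, Message, sendAndWait, startFakeBackend) is made up for illustration. It assumes the backend echoes a correlation id on its reply; in a real job the queues would be JMS destinations, typically accessed through a JmsTemplate.

```java
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Plain-Java stand-in for the request/reply flow; none of these names come from Spring Batch or JMS.
final class RequestReplySketch {

    // Hypothetical message shape: a correlation id plus a payload.
    static final class Message {
        final String correlationId;
        final String payload;

        Message(String correlationId, String payload) {
            this.correlationId = correlationId;
            this.payload = payload;
        }
    }

    // In-memory queues standing in for the JMS request and response destinations.
    static final BlockingQueue<Message> REQUESTS = new LinkedBlockingQueue<>();
    static final BlockingQueue<Message> RESPONSES = new LinkedBlockingQueue<>();

    // Steps 2 and 3: put the request on the queue, then block until the reply
    // carrying the same correlation id arrives or the timeout expires.
    static String sendAndWait(String payload, long timeoutMillis) {
        String correlationId = UUID.randomUUID().toString();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            REQUESTS.put(new Message(correlationId, payload));
            while (true) {
                long remaining = deadline - System.nanoTime();
                Message reply = RESPONSES.poll(Math.max(remaining, 0), TimeUnit.NANOSECONDS);
                if (reply == null) {
                    throw new IllegalStateException("No reply within " + timeoutMillis + " ms");
                }
                if (reply.correlationId.equals(correlationId)) {
                    return reply.payload;
                }
                // A reply to some other request: this sketch simply drops it.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for reply", e);
        }
    }

    // Simulates the existing asynchronous system: consumes requests, publishes replies.
    static void startFakeBackend() {
        Thread backend = new Thread(() -> {
            try {
                while (true) {
                    Message request = REQUESTS.take();
                    RESPONSES.put(new Message(request.correlationId, "processed:" + request.payload));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        backend.setDaemon(true);
        backend.start();
    }

    public static void main(String[] args) {
        startFakeBackend();
        System.out.println(sendAndWait("order-1", 2000));
    }
}
```

The timeout is the important design choice: without it, a lost response would hang the step forever.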
I am attempting to accomplish something along these lines with Quarkus and Narayana:
client calls service to start a process that takes a while: /lra/start
This call sets off an LRA, and returns an LRA id used to track the status of the action
client can keep polling some endpoint to determine status
service eventually finishes and marks the action done through the coordinator
client sees that the action has completed, is given the result or makes another request to get that result
Is this a valid use case? Am I visualizing the correct way this tool can work? Based on how the linked guide reads, it seems that the endpoints are more of a passthrough to the coordinator, notifying it that we start and end an LRA. Is there a more programmatic way to interact with the coordinator?
Yes, it might be a valid use case, but in any case please read the MicroProfile LRA specification - https://github.com/eclipse/microprofile-lra.
The idea you describe is more or less one LRA participant executing in a new LRA, with the client polling the status of this execution. This is not exactly what LRA is intended for, but it can certainly be used this way.
The main idea of LRA is the composition of distributed transactions based on the saga pattern. Basically, the point is to coordinate multiple services to achieve consistent results with an eventual consistency guarantee. So you see that the main benefit arises when you can propagate LRA through different services that either all complete their actions or all of their compensation callbacks will be called in case of failures (and, of course, only for the services that executed their actions in the first place). Here is also an example with the LRA propagation https://github.com/xstefank/quarkus-lra-trip-example.
EDIT: Sorry, I forgot to add the programmatic API that allows the same interactions as the annotations - https://github.com/jbosstm/narayana/blob/master/rts/lra/client/src/main/java/io/narayana/lra/client/NarayanaLRAClient.java. However, note that it is not part of the specification and is specific to Narayana.
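For the start-then-poll flow from the question, a rough plain-Java sketch of the shape might look like the following. It deliberately leaves the LRA coordinator out: LongActionTracker, its Status values and its methods are all hypothetical names, and the map of futures merely plays the role that the coordinator and the LRA id would play in a real setup.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory tracker; it only illustrates "start returns an id, client polls by id".
final class LongActionTracker {

    enum Status { IN_PROGRESS, DONE, FAILED }

    private final Map<String, CompletableFuture<String>> actions = new ConcurrentHashMap<>();

    // Plays the role of /lra/start: kicks off the work and returns a tracking id right away.
    String start(Supplier<String> work) {
        String id = UUID.randomUUID().toString();
        actions.put(id, CompletableFuture.supplyAsync(work));
        return id;
    }

    // Plays the role of the status endpoint the client keeps polling.
    Status status(String id) {
        CompletableFuture<String> action = lookup(id);
        if (!action.isDone()) {
            return Status.IN_PROGRESS;
        }
        return action.isCompletedExceptionally() ? Status.FAILED : Status.DONE;
    }

    // Blocks until the action has finished and returns its result.
    String result(String id) {
        return lookup(id).join();
    }

    private CompletableFuture<String> lookup(String id) {
        CompletableFuture<String> action = actions.get(id);
        if (action == null) {
            throw new IllegalArgumentException("Unknown action id: " + id);
        }
        return action;
    }

    public static void main(String[] args) throws InterruptedException {
        LongActionTracker tracker = new LongActionTracker();
        String id = tracker.start(() -> "booking confirmed");
        while (tracker.status(id) == Status.IN_PROGRESS) {
            Thread.sleep(10); // the client polls until the action has finished
        }
        System.out.println(tracker.status(id) + ": " + tracker.result(id));
    }
}
```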
I have a problem modeling an integration flow for the following use case:
The input to the system is some kind of Message. That message goes
through a Splitter and a Transformer endpoint, and then to a
ServiceActivator where the transformed message is processed. This
part is clear to me.
The confusion comes with the next part. After the ServiceActivator
finishes processing, I need to take the base Message (the message from the
beginning of the first part) again and send it through other processing, for example through a Splitter and a Transformer again. How can
I model that use case? Can I reset the message payload to that base
value? Is there some component that could help me?
I hope I have described it well.
Your use case sounds more like a PublishSubscribeChannel: https://docs.spring.io/spring-integration/docs/current/reference/html/core.html#channel-implementations-publishsubscribechannel. You would have several subscribers (splitters) for that channel, and the same input message would be processed in those independent sub-flows. You can even do that in parallel if you configure an Executor on that PublishSubscribeChannel.
Another way, if you can do that in parallel and you still need some result from that ServiceActivator to be available alongside the original message for the next endpoint or so, then you can use a HeaderEnricher to store the original message in the headers and get access to it whenever you need it in your flow: https://docs.spring.io/spring-integration/docs/current/reference/html/message-transformation.html#header-enricher
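To illustrate the publish-subscribe idea without pulling in Spring Integration, here is a small plain-Java sketch. PubSub, FanOutSketch and process are hypothetical names, not Spring Integration classes; the point is only that each subscriber receives the same, unmodified base message and runs its own split/transform steps independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

final class FanOutSketch {

    // Hypothetical stand-in for a publish-subscribe channel: every subscriber gets each message.
    static final class PubSub<T> {
        private final List<Consumer<T>> subscribers = new ArrayList<>();

        void subscribe(Consumer<T> subscriber) {
            subscribers.add(subscriber);
        }

        void send(T message) {
            for (Consumer<T> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    static List<String> process(String baseMessage) {
        List<String> handled = new ArrayList<>();
        PubSub<String> channel = new PubSub<>();

        // Sub-flow A: split on commas, transform to upper case, hand over to a "service".
        channel.subscribe(message -> {
            for (String part : message.split(",")) {
                handled.add("A:" + part.toUpperCase(Locale.ROOT));
            }
        });

        // Sub-flow B: receives the same base message and transforms it differently.
        channel.subscribe(message -> {
            for (String part : message.split(",")) {
                handled.add("B:" + part.length());
            }
        });

        channel.send(baseMessage);
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(process("ab,cde"));
    }
}
```

Here the subscribers run one after another on the sending thread; handing each one to an executor would give the parallel variant.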
Context: in my country a new instant payment scheme is planned for November. Basically, the Central Bank will provide two endpoints: (1) a POST endpoint to which we post a single money transfer and (2) a GET endpoint from which we get the result of a previously sent money transfer; results can arrive completely out of order. Each GET returns only one money transfer result, and a response header indicates whether there is another result we must GET. It never says how many results are available: if there is a result, it is returned in the GET response together with an indication of whether it is the last one or whether more remain for the next GET.
Top limitation: from the moment the end user taps the Transfer button in the mobile app until the final result (success or failure) is shown on the screen, we have 10 seconds.
Strategy: I want a scheduler that triggers a GET to the Central Bank every second or even more often. The scheduler will basically invoke a simple function which
Calls the GET endpoint
Pushes the result to Kafka or persists it in a database, and
If the response headers indicate that more results are available, starts the same function again.
Issue: Since we are Spring users/followers, I thought my decision was between Spring Batch and org.springframework.scheduling.annotation.SchedulingConfigurer/TaskScheduler. I have used Spring Batch successfully for a while, but never with such a short trigger period (never with a 1-second period). I stumbled on a discussion that made me wonder whether, for a very simple task with a very short period, I should consider Spring Cloud Data Flow or Spring Cloud Task instead of Spring Batch.
According to this answer, "... Spring Batch is ... designed for the building of complex compute problems ... You can orchestrate Spring Batch jobs with Spring Scheduler if you want". Based on that, it seems I shouldn't use Spring Batch because my case isn't complex. The challenging design decision has more to do with the short trigger period and with triggering another batch from the current one than with transformation, calculation or an ETL process. Nevertheless, as far as I can see, Spring Batch with its tasklets is well designed for restarting, resuming and retrying, and fits a scenario which never finishes, while org.springframework.scheduling seems to be only a way to trigger an event based on a period configuration. Well, this is my feeling based on personal use and study.
According to an answer to someone asking about orchestration of composed tasks, "... you can achieve your design goals using Spring Cloud Data Flow along with the Spring Cloud Task/Spring Batch...". In my case, I don't see composed tasks: the second trigger doesn't depend on the result of the previous one. It sounds more like "chained" tasks than "composed" ones. I have never used Spring Cloud Data Flow, but it seems a nice candidate for managing and viewing the triggered tasks through a console/dashboard. Nevertheless, I didn't find any information on limitations or rules of thumb for short-period triggers and "chained" triggers.
So my straight question is: which member of the Spring family is currently recommended for such a short trigger period? Assuming Spring Cloud Data Flow is used as the manager/dashboard, which Spring component is recommended as the trigger in such short-period scenarios? It seems Spring Cloud Task is designed for calling complex functions, Spring Batch seems to add much more than I need, and org.springframework.scheduling.* lacks integration with Spring Cloud Data Flow. As an analogy and not as a comparison, in AWS the documentation clearly says "don't use CloudWatch for less than one minute. If you want less than one minute, start CloudWatch for each minute that start another scheduler/cron each second". There might be a well-known rule of thumb for a simple task that needs to be triggered every second or even more often and that takes advantage of the Spring family's approach/concerns/experience.
This may be a stupid answer, but why do you need a scheduler here? Wouldn't a never-ending job achieve the goal?
You start a job; it does a GET request and pushes the result to Kafka.
If the GET response indicates that there are more results, it immediately does another GET and pushes the result to Kafka.
If the GET response indicates that there are no more results, it sleeps for 1 second and then does the GET request again.
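The loop described above could be sketched roughly like this. Everything here is an assumption made for illustration: the Response shape, the Supplier standing in for the GET call, and the Consumer standing in for the Kafka producer. The maxCalls bound exists only so the sketch terminates; the real job would loop until it is shut down.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class ResultPoller {

    // Hypothetical shape of one GET response: a result body (null when there
    // is none) and the header flag saying whether more results are waiting.
    static final class Response {
        final String body;
        final boolean moreAvailable;

        Response(String body, boolean moreAvailable) {
            this.body = body;
            this.moreAvailable = moreAvailable;
        }
    }

    // Performs up to maxCalls GETs and returns how many times the loop slept while idle.
    static int poll(Supplier<Response> get, Consumer<String> publish, long idleMillis, int maxCalls) {
        int idleSleeps = 0;
        for (int call = 0; call < maxCalls; call++) {
            Response response = get.get();
            if (response.body != null) {
                publish.accept(response.body); // e.g. push to Kafka
            }
            if (response.moreAvailable) {
                continue; // more results are waiting: GET again immediately
            }
            try {
                Thread.sleep(idleMillis); // nothing pending: back off before the next GET
                idleSleeps++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return idleSleeps;
    }

    // Runs the loop against three scripted responses and returns what was published.
    static List<String> demo() {
        Deque<Response> scripted = new ArrayDeque<>(List.of(
                new Response("transfer-1", true),
                new Response("transfer-2", false),
                new Response(null, false)));
        List<String> published = new ArrayList<>();
        poll(scripted::poll, published::add, 1, 3);
        return published;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```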
I am stuck on a use case where I am not sure what the behavior of Kafka will be.
SCENARIO: I am using Spring Kafka with Spring Boot. My application has a REST endpoint which reads all messages from the beginning of a topic to check for a duplicate message, then writes to the topic if the message is not a duplicate.
I am confused about how the application will behave when multiple instances of the same microservice are deployed and the offset is moved by the seek-to-beginning operation.
A few questions on my mind:
Does reading from the beginning of a topic (with the help of seek) block the topic?
If yes, how do we solve this use case, where we have to check for a
duplicate message before writing to the topic?
Using a DB is not a solution because it would be resource intensive and make the application slower.
Thanks, everyone, in advance.
Sounds like you need the Log Compaction feature:
Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition.
Therefore, when you specify a unique message key, after compaction the partition retains only the latest record for that key. With that, you don't need to read the topic before storing at all.
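As a rough illustration of what compaction leaves behind, here is a plain-Java simulation; it is not Kafka code, and CompactionSketch and its methods are made-up names. Each record is a {key, value} pair, and only the most recent value per key survives. On a real topic this behavior is enabled with the topic setting cleanup.policy=compact, and the cleanup runs in the background, so older records for a key can still be visible for a while.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CompactionSketch {

    // Simulates the end state of compaction for one partition: for every key,
    // only the latest value is kept, in the order of the surviving records.
    static Map<String, String> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] record : log) {
            latest.remove(record[0]);         // drop the older record for this key, if any
            latest.put(record[0], record[1]); // the newer record takes its place at the end
        }
        return latest;
    }

    // Three records, two of which share the key "order-1".
    static Map<String, String> demo() {
        return compact(List.of(
                new String[] {"order-1", "created"},
                new String[] {"order-2", "created"},
                new String[] {"order-1", "paid"}));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```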
I have a MassTransit system that will consume 2 message types, one for a batch process, the other for CRUD operations on a single entity. Whilst the batch process is running, the CRUD operations should not be de-queued.
Is this possible to achieve using MassTransit? It seems the exchange binding -> type name, would potentially make this behavior difficult.
A solution would be to use one message type to denote both operations and then interrogate the message contents to discern between single and batch but this feels like a code smell. Also, this would require concurrency configuration to ensure only one consumer is ever active.
Can anyone help with an alternative solution here? Essentially, we need to pause all message consumption whilst an event driven process is running.
Thanks in advance.
By pause, do you mean that you want the CRUD operations to be able to occur without being blocked by the batch process? Because if it's only a matter of not having the two separate messages get in the way of each other, the most logical solution is using two separate queues, one receive endpoint for the batch process and another for the CRUD operations.
Now, if you truly need to separate the batch process such that it doesn't happen during the CRUD operations, that will require more work. And what if you receive a CRUD operation while the batch process is already running?
I think separate queues are your best solution, however.