I have the following app requirements:
Messages are received from RabbitMQ and then aggregated based on some more complex rules, e.g. based on a type property (with a pre-given type-to-time mapping) and based on how long a message has already been waiting in the queue (an old property).
All messages should be released at some variable rate, e.g. from 1 msg/sec up to 100 msg/sec. This rate is controlled and set by a service that monitors the size of a RabbitMQ queue (one queue that is not related to this component and is further up the pipeline) and decreases the rate if too many messages are in that queue.
As you can see in the image, one use case: three messages are already aggregated and are waiting to be released on the next second (since the current rate is 1 msg/sec), but just at that moment a MSG with id:10 arrives and updates AGGREGATED 2, making it the first message by priority. So on the next tick, instead of releasing AGGREGATED 3, we release AGGREGATED 2, since it now has higher priority.
Now the question is: can I use the Spring Integration Aggregator for this? I do not know whether it supports prioritization of messages during aggregation. I know of groupTimeout, but that only adjusts a single message group; it does not change the priority of other groups. Would it be possible to use a MessageGroupStoreReaper that adjusts all other aggregated messages by priority when a new MSG arrives?
UPDATE
I did an implementation like the one below. It seems OK for now: it aggregates messages as they arrive, and the comparator sorts them by my custom logic.
Do you think there could be problems with this (concurrency, etc.)? I can see in the logs that the poller is sometimes invoked more than once. Is this normal?
2021-01-18 13:52:05.277 INFO 16080 --- [ scheduling-1] ggregatorConfig$PriorityAggregatingQueue : POLL
2021-01-18 13:52:05.277 INFO 16080 --- [ scheduling-1] ggregatorConfig$PriorityAggregatingQueue : POLL
2021-01-18 13:52:05.277 INFO 16080 --- [ scheduling-1] ggregatorConfig$PriorityAggregatingQueue : POLL
2021-01-18 13:52:05.277 INFO 16080 --- [ scheduling-1] ggregatorConfig$PriorityAggregatingQueue : POLL
Also, is the commented-out doit method below the proper way to increase the maximum number of polled messages at runtime?
@Bean
public MessageChannel aggregatingChannel() {
    return new QueueChannel(new PriorityAggregatingQueue<>(
            (m1, m2) -> m2, // placeholder: aggregation logic goes here
            Comparator.comparingInt(x -> x),
            (m) -> {
                ExampleDTO d = (ExampleDTO) m.getPayload();
                return d.getId();
            }
    ));
}
class PriorityAggregatingQueue<K> extends AbstractQueue<Message<?>> {

    private final Log logger = LogFactory.getLog(getClass());
    private final BiFunction<Message<?>, Message<?>, Message<?>> accumulator;
    private final Function<Message<?>, K> keyExtractor;
    private final NavigableMap<K, Message<?>> keyToAggregatedMessage;

    public PriorityAggregatingQueue(BiFunction<Message<?>, Message<?>, Message<?>> accumulator,
                                    Comparator<? super K> comparator,
                                    Function<Message<?>, K> keyExtractor) {
        this.accumulator = accumulator;
        this.keyExtractor = keyExtractor;
        this.keyToAggregatedMessage = new ConcurrentSkipListMap<>(comparator);
    }

    @Override
    public Iterator<Message<?>> iterator() {
        return keyToAggregatedMessage.values().iterator();
    }

    @Override
    public int size() {
        return keyToAggregatedMessage.size();
    }

    // merge the incoming message into the aggregate stored under its key
    @Override
    public boolean offer(Message<?> m) {
        logger.info("OFFER");
        return keyToAggregatedMessage.compute(keyExtractor.apply(m), (k, old) -> accumulator.apply(old, m)) != null;
    }

    // release the aggregate with the highest priority (last entry in the sorted map)
    @Override
    public Message<?> poll() {
        logger.info("POLL");
        Map.Entry<K, Message<?>> m = keyToAggregatedMessage.pollLastEntry();
        return m != null ? m.getValue() : null;
    }

    @Override
    public Message<?> peek() {
        Map.Entry<K, Message<?>> m = keyToAggregatedMessage.lastEntry();
        return m != null ? m.getValue() : null;
    }
}
// @Scheduled(fixedDelay = 10 * 1000)
// public void doit() {
//     System.out.println("INCREASE POLL");
//     pollerMetadata().setMaxMessagesPerPoll(pollerMetadata().getMaxMessagesPerPoll() * 2);
// }
@Bean(name = PollerMetadata.DEFAULT_POLLER)
public PollerMetadata pollerMetadata() {
    PollerMetadata metadata = new PollerMetadata();
    metadata.setTrigger(new DynamicPeriodicTrigger(Duration.ofSeconds(30)));
    metadata.setMaxMessagesPerPoll(1);
    return metadata;
}
@Bean
public IntegrationFlow aggregatingFlow(
        AmqpInboundChannelAdapter aggregatorInboundChannel,
        AmqpOutboundEndpoint aggregatorOutboundChannel,
        MessageChannel wtChannel,
        MessageChannel aggregatingChannel,
        PollerMetadata pollerMetadata
) {
    return IntegrationFlows.from(aggregatorInboundChannel)
            .wireTap(wtChannel)
            .channel(aggregatingChannel)
            .handle(aggregatorOutboundChannel)
            .get();
}
Well, if a new message that completes a group arrives at the aggregator, that group is released immediately (provided your ReleaseStrategy says so). The rest of the groups under timeout continue to wait for their schedule.
It is probably possible to come up with a smart algorithm that relies on a single common schedule with the MessageGroupStoreReaper to decide whether to release a partial group or just discard it. Again, the ReleaseStrategy should give us a clue whether to release or not, even partially. When a discard happens and we want to keep those messages in the aggregator, we need to resend them to the aggregator after some delay. After expiration the group is removed from the store, and by then the messages have already been sent to the discard channel, so it is better to delay them and let the aggregator clean up those groups; after the delay we can safely send them back to the aggregator for a new expiration period as parts of new groups.
You can probably also iterate over all of the messages in the store after a normal group release to adjust some time key in their headers for the next expiration.
I know this is a hard matter, but there really is no out-of-the-box solution, since the aggregator was not designed to affect other groups from the one we just dealt with...
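Purely to illustrate the delay-and-resend idea (the channel names and the 30-second delay below are made up, not taken from your configuration, and the aggregator would need a discardChannel pointing at the first one):
@Bean
public IntegrationFlow resubmitDiscardedFlow() {
    // discarded partial groups are delayed and then sent back to the aggregator input,
    // where they start a new group and get a fresh expiration period
    return IntegrationFlows.from("aggregatorDiscardChannel")
            .delay("resubmitDelayer", d -> d.defaultDelay(30_000))
            .channel("aggregatorInputChannel")
            .get();
}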
Related
I'm trying to use the idle-between-polls feature mentioned here to slow down the consumption rate. I also set max.poll.interval.ms to double the idle between polls, but it's always triggering a partition rebalance. Any idea what the problem is?
[Edit]
I have 5 hosts and I'm setting the concurrency level to 1.
[Edit 2]
I was setting the idle between polls to 5 minutes and max.poll.interval.ms to 10 minutes. I also noticed this log: "About to close the idle connection from 105 due to being idle for 540012 millis".
I decreased the idle between polls to 10 seconds and the issue disappeared. Any idea why?
private ConsumerFactory<String, GenericRecord> dlqConsumerFactory() {
    Map<String, Object> configurationProperties = commonConfigs();
    DlqConfiguration dlqConfiguration = kafkaProperties.getConsumer().getDlq();
    final Integer idleBetweenPollInterval = dlqConfiguration.getIdleBetweenPollInterval()
            .orElse(DLQ_POLL_INTERVAL);
    final Integer maxPollInterval = idleBetweenPollInterval * 2; // two times the idle between polls, to prevent rebalancing
    logger.info("Setting max poll interval to {} for DLQ", maxPollInterval);
    overrideIfRequired(DQL_CONSUMER_CONFIGURATION, configurationProperties, ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, maxPollInterval);
    dlqConfiguration.getMaxPollRecords().ifPresent(maxPollRecords ->
            overrideIfRequired(DQL_CONSUMER_CONFIGURATION, configurationProperties, ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords)
    );
    return new DefaultKafkaConsumerFactory<>(configurationProperties);
}
<time to process last polled records> + <idle between polls> must be less than max.poll.interval.ms. With your numbers, the 5-minute idle plus the time to process each polled batch must stay below the 10-minute max.poll.interval.ms; otherwise the consumer misses its poll deadline and a rebalance is triggered.
EDIT
There is logic in the container to make sure we never exceed the max poll interval:
idleBetweenPolls = Math.min(idleBetweenPolls,
this.maxPollInterval - (System.currentTimeMillis() - this.lastPoll)
- 5000); // NOSONAR - less by five seconds to avoid race condition with rebalance
I can't reproduce the issue with this...
@SpringBootApplication
public class So63411124Application {

    public static void main(String[] args) {
        SpringApplication.run(So63411124Application.class, args);
    }

    @KafkaListener(id = "so63411124", topics = "so63411124")
    public void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public ApplicationRunner runner(ConcurrentKafkaListenerContainerFactory<?, ?> factory,
            KafkaTemplate<String, String> template) {

        factory.getContainerProperties().setIdleBetweenPolls(300000L);
        return args -> {
            while (true) {
                template.send("so63411124", "foo");
                Thread.sleep(295000);
            }
        };
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so63411124").partitions(1).replicas(1).build();
    }
}
logging.level.org.springframework.kafka=debug
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.properties.max.poll.interval.ms=600000
If you can provide a small example like this that exhibits the behavior you describe, I will take a look to see what's wrong.
I've implemented my own Transformer, where I'm accumulating events in my StateStore (backed by the internal changelog Kafka topic), and once it accumulates X events, I emit one accumulated event downstream.
Everything works, but I don't want accumulated events to be stuck in the pipe if no new events are posted for a while.
That's why I decided to use a Punctuator:
@Override
public void init(ProcessorContext context) {
    this.context = context;
    this.kvStore = (KeyValueStore<Event, Long>) this.context.getStateStore(stateStore);
    this.context.schedule(Duration.ofMinutes(180), PunctuationType.WALL_CLOCK_TIME, (timestamp) -> {
        KeyValueIterator<Event, Long> iter = kvStore.all();
        while (iter.hasNext()) {
            KeyValue<Event, Long> entry = iter.next();
            ...
        }
        iter.close();
        context.commit();
    });
}
The problem is that the entries in the StateStore do not contain any information about when they were sent. Is there a way to access that? Do I need to implement my own custom StateStore to make it work?
Thank you!
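One workaround I'm considering (just a sketch; CountWithTimestamp is a hypothetical wrapper that would need its own Serde) is to record the time myself when writing to the store, so the punctuator can skip entries that were updated recently:
// hypothetical value wrapper stored instead of a plain Long
public class CountWithTimestamp {
    public final long count;
    public final long lastUpdatedMs;

    public CountWithTimestamp(long count, long lastUpdatedMs) {
        this.count = count;
        this.lastUpdatedMs = lastUpdatedMs;
    }
}

// inside transform(), with the store re-declared as KeyValueStore<Event, CountWithTimestamp>
CountWithTimestamp current = kvStore.get(event);
long now = context.timestamp(); // record time; System.currentTimeMillis() for wall-clock time
long newCount = (current == null) ? 1 : current.count + 1;
kvStore.put(event, new CountWithTimestamp(newCount, now));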
I am trying to figure out the best way to handle errors that occur in a service that is called after an aggregator's group timeout, so that the flow mimics what happens when the releaseExpression is met.
Here is my setup:
I have an AmqpInboundChannelAdapter that takes in messages and sends them to my aggregator.
When the releaseExpression has been met (before the groupTimeout expires) and an exception gets thrown in my ServiceActivator, all the messages in that MessageGroup get sent to my dead letter queue (10 messages in my example below, which is only used for illustrative purposes). This is what I would expect.
If my releaseExpression hasn't been met but the groupTimeout has been reached and the group times out, and an exception gets thrown in my ServiceActivator, then the messages do not get sent to my dead letter queue and are acked.
After reading another blog post,
link1
I learned that this happens because the processing is done on another thread, by the MessageGroupStoreReaper, and not on the thread the SimpleMessageListenerContainer was on. Once processing moves away from the listener container's thread, the messages are auto-acked.
I added the configuration mentioned in the link above and can see the error messages getting sent to my error handler. My main question is: what is considered the best way to handle this scenario to minimize messages getting lost?
Here are the options I was exploring:
Use a BatchingRabbitTemplate in my custom error handler to publish the failed messages to the same dead letter queue they would have gone to if the releaseExpression had been met. (This is the approach I outlined below, but I am worried about messages getting lost if an error happens during publishing.)
Investigate whether there is a way I could let the SimpleMessageListenerContainer know about the error that occurred and have it send the failed batch of messages to a dead letter queue. I doubt this is possible since it seems the messages are already acked.
Don't set the SimpleMessageListenerContainer to AcknowledgeMode.AUTO; instead, manually ack the messages when they get processed by the service, whether the releaseExpression was met or the groupTimeout happened. (This seems kind of messy, since there can be 1..N messages in the MessageGroup, but I wanted to see what others have done.)
Ideally, I want a flow that mimics the flow that runs when the releaseExpression has been met, so that messages don't get lost.
Does anyone have a recommendation on the best way to handle this scenario that they have used in the past?
Thanks for any help and/or advice!
Here is my current configuration using Spring Integration DSL
@Bean
public SimpleMessageListenerContainer workListenerContainer() {
    SimpleMessageListenerContainer container =
            new SimpleMessageListenerContainer(rabbitConnectionFactory);
    container.setQueues(worksQueue());
    container.setConcurrentConsumers(4);
    container.setDefaultRequeueRejected(false);
    container.setTransactionManager(transactionManager);
    container.setChannelTransacted(true);
    container.setTxSize(10);
    container.setAcknowledgeMode(AcknowledgeMode.AUTO);
    return container;
}
@Bean
public AmqpInboundChannelAdapter inboundRabbitMessages() {
    AmqpInboundChannelAdapter adapter = new AmqpInboundChannelAdapter(workListenerContainer());
    return adapter;
}
I have defined an error channel and my own taskScheduler to use for the MessageGroupStoreReaper:
@Bean
public ThreadPoolTaskScheduler taskScheduler() {
    ThreadPoolTaskScheduler ts = new ThreadPoolTaskScheduler();
    MessagePublishingErrorHandler mpe = new MessagePublishingErrorHandler();
    mpe.setDefaultErrorChannel(myErrorChannel());
    ts.setErrorHandler(mpe);
    return ts;
}
@Bean
public PollableChannel myErrorChannel() {
    return new QueueChannel();
}
public IntegrationFlow aggregationFlow() {
    return IntegrationFlows.from(inboundRabbitMessages())
            .transform(Transformers.fromJson(SomeObject.class))
            .aggregate(a -> {
                a.sendPartialResultOnExpiry(true);
                a.groupTimeout(3000);
                a.expireGroupsUponCompletion(true);
                a.expireGroupsUponTimeout(true);
                a.correlationExpression("T(Thread).currentThread().id");
                a.releaseExpression("size() == 10");
                a.transactional(true);
            })
            .handle("someService", "processMessages")
            .get();
}
Here is my custom error flow
@Bean
public IntegrationFlow errorResponse() {
    return IntegrationFlows.from("myErrorChannel")
            .<MessagingException, Message<?>>transform(MessagingException::getFailedMessage,
                    e -> e.poller(p -> p.fixedDelay(100)))
            .channel("myErrorChannelHandler")
            .handle("myErrorHandler", "handleFailedMessage")
            .log()
            .get();
}
Here is the custom error handler
@Component
public class MyErrorHandler {

    @Autowired
    BatchingRabbitTemplate batchingRabbitTemplate;

    @ServiceActivator(inputChannel = "myErrorChannelHandler")
    public void handleFailedMessage(Message<?> message) {
        ArrayList<SomeObject> payload = (ArrayList<SomeObject>) message.getPayload();
        payload.forEach(m -> batchingRabbitTemplate.convertAndSend("some.dlq", "#", m));
    }
}
Here is the BatchingRabbitTemplate bean
@Bean
public BatchingRabbitTemplate batchingRabbitTemplate() {
    ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
    scheduler.setPoolSize(5);
    scheduler.initialize();
    BatchingStrategy batchingStrategy = new SimpleBatchingStrategy(10, Integer.MAX_VALUE, 30000);
    BatchingRabbitTemplate batchingRabbitTemplate = new BatchingRabbitTemplate(batchingStrategy, scheduler);
    batchingRabbitTemplate.setConnectionFactory(rabbitConnectionFactory);
    return batchingRabbitTemplate;
}
Update 1, showing the custom MessageGroupProcessor:
public class CustomAggregtingMessageGroupProcessor extends AbstractAggregatingMessageGroupProcessor {

    @Override
    protected final Object aggregatePayloads(MessageGroup group, Map<String, Object> headers) {
        return group;
    }
}
Example Service:
@Slf4j
public class SomeService {

    @ServiceActivator
    public void processMessages(MessageGroup messageGroup) throws IOException {
        Collection<Message<?>> messages = messageGroup.getMessages();
        // Do business logic
        // ack messages in the group
        for (Message<?> m : messages) {
            com.rabbitmq.client.Channel channel = (com.rabbitmq.client.Channel)
                    m.getHeaders().get("amqp_channel");
            long deliveryTag = (long) m.getHeaders().get("amqp_deliveryTag");
            log.debug("deliveryTag = {}", deliveryTag);
            log.debug("Channel = {}", channel);
            channel.basicAck(deliveryTag, false);
        }
    }
}
Updated IntegrationFlow:
public IntegrationFlow aggregationFlowWithCustomMessageProcessor() {
    return IntegrationFlows.from(inboundRabbitMessages())
            .transform(Transformers.fromJson(SomeObject.class))
            .aggregate(a -> {
                a.sendPartialResultOnExpiry(true);
                a.groupTimeout(3000);
                a.expireGroupsUponCompletion(true);
                a.expireGroupsUponTimeout(true);
                a.correlationExpression("T(Thread).currentThread().id");
                a.releaseExpression("size() == 10");
                a.transactional(true);
                a.outputProcessor(new CustomAggregtingMessageGroupProcessor());
            })
            .handle("someService", "processMessages")
            .get();
}
New error handler to do the nack:
public class MyErrorHandler {

    @ServiceActivator(inputChannel = "myErrorChannelHandler")
    public void handleFailedMessage(MessageGroup messageGroup) throws IOException {
        if (messageGroup != null) {
            log.debug("Nack messages size = {}", messageGroup.getMessages().size());
            Collection<Message<?>> messages = messageGroup.getMessages();
            for (Message<?> m : messages) {
                com.rabbitmq.client.Channel channel = (com.rabbitmq.client.Channel)
                        m.getHeaders().get("amqp_channel");
                long deliveryTag = (long) m.getHeaders().get("amqp_deliveryTag");
                log.debug("deliveryTag = {}", deliveryTag);
                log.debug("channel = {}", channel);
                channel.basicNack(deliveryTag, false, false);
            }
        }
    }
}
Update 2: added a custom ReleaseStrategy and changed the aggregator:
public class CustomMeasureGroupReleaseStratgedy implements ReleaseStrategy {

    private static final int MAX_MESSAGE_COUNT = 10;

    @Override
    public boolean canRelease(MessageGroup messageGroup) {
        return messageGroup.getMessages().size() >= MAX_MESSAGE_COUNT;
    }
}
public IntegrationFlow aggregationFlowWithCustomMessageProcessorAndReleaseStratgedy() {
    return IntegrationFlows.from(inboundRabbitMessages())
            .transform(Transformers.fromJson(SomeObject.class))
            .aggregate(a -> {
                a.sendPartialResultOnExpiry(true);
                a.groupTimeout(3000);
                a.expireGroupsUponCompletion(true);
                a.expireGroupsUponTimeout(true);
                a.correlationExpression("T(Thread).currentThread().id");
                a.transactional(true);
                a.releaseStrategy(new CustomMeasureGroupReleaseStratgedy());
                a.outputProcessor(new CustomAggregtingMessageGroupProcessor());
            })
            .handle("someService", "processMessages")
            .get();
}
There are some flaws in your understanding. If you use AUTO, only the last message will be dead-lettered when an exception occurs. Messages successfully deposited in the group before the release will be ack'd immediately.
The only way to achieve what you want is to use MANUAL acks.
There is no way to "tell the listener container to send messages to the DLQ". The container never sends messages to the DLQ; it rejects a message and the broker sends it to the DLX/DLQ.
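A minimal sketch of that change, adapting the container bean from your question (the transactional settings are left out here; whether to keep them is a separate decision):
@Bean
public SimpleMessageListenerContainer workListenerContainer() {
    SimpleMessageListenerContainer container =
            new SimpleMessageListenerContainer(rabbitConnectionFactory);
    container.setQueues(worksQueue());
    container.setConcurrentConsumers(4);
    container.setDefaultRequeueRejected(false);
    // with MANUAL acks the container no longer acks deliveries; your service and
    // error handler must call basicAck/basicNack for every message, as in your updates
    container.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    return container;
}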
I wish to batch and process items as they come along, so I created a UnicastProcessor and subscribed to it like this:
UnicastProcessor<String> processor = UnicastProcessor.create();
processor
        .bufferTimeout(10, Duration.ofMillis(500))
        .subscribe(new Subscriber<List<String>>() {
            @Override
            public void onSubscribe(Subscription subscription) {
                System.out.println("OnSubscribe");
            }

            @Override
            public void onNext(List<String> strings) {
                System.out.println("OnNext");
            }

            @Override
            public void onError(Throwable throwable) {
                System.out.println("OnError");
            }

            @Override
            public void onComplete() {
                System.out.println("OnComplete");
            }
        });
Then, for testing purposes, I created a new thread and started adding items in a loop:
new Thread(() -> {
    int limit = 100;
    int i = 0;
    while (i < limit) {
        ++i;
        processor.sink().next("Hello " + i);
    }
    System.out.println("Published all");
}).start();
After running this (and letting the main thread sleep for 5 seconds), I can see that all items have been published, but the subscriber does not trigger on any of the events, so I can't process any of the published items.
What am I doing wrong here?
The Reactive Streams specification is the answer!
The total number of onNexts signalled by a Publisher to a Subscriber MUST be less than or equal to the total number of elements requested by that Subscriber's Subscription at all times. [Rule 1.1]
In your example, you simply provide a subscriber that does nothing. In turn, the Reactive Streams specification directly says that nothing will happen (there will be no onNext invocation) if you have not called the Subscription#request method:
A Subscriber MUST signal demand via Subscription.request(long n) to
receive onNext signals. [Rule 2.1]
Thus, one possible solution to your problem is to change the code in the following way:
UnicastProcessor<String> processor = UnicastProcessor.create();
processor
        .bufferTimeout(10, Duration.ofMillis(500))
        .subscribe(new Subscriber<List<String>>() {
            @Override
            public void onSubscribe(Subscription subscription) {
                System.out.println("OnSubscribe");
                subscription.request(Long.MAX_VALUE);
            }

            @Override
            public void onNext(List<String> strings) {
                System.out.println("OnNext");
            }

            @Override
            public void onError(Throwable throwable) {
                System.out.println("OnError");
            }

            @Override
            public void onComplete() {
                System.out.println("OnComplete");
            }
        });
Note that in this example a demand of Long.MAX_VALUE means unbounded demand, so all messages will be pushed directly to the given Subscriber [Rule 3.17].
Use UnicastProcessor correctly
On the one hand, your example will work correctly with the mentioned fixes. On the other hand, each invocation of FluxProcessor#sink() (yes, sink is a FluxProcessor method) leads to a redundant call of UnicastProcessor's onSubscribe method, which under the hood causes a few atomic reads and writes; these can be avoided by creating the FluxSink once and safely using it as many times as needed. For example:
UnicastProcessor<String> processor = UnicastProcessor.create();
FluxSink<String> sink = processor.serialize().sink();
...
new Thread(() -> {
    int limit = 100;
    int i = 0;
    while (i < limit) {
        ++i;
        sink.next("Hello " + i);
    }
    System.out.println("Published all");
}).start();
Note that in this example I call the additional method serialize(), which provides a thread-safe sink and ensures that calling FluxSink#next concurrently will not violate the Reactive Streams spec.
Could this be a correct way to dispatch common topic information to a browser client?
@RestController
public class GenvScriptHandler {

    DirectProcessor<String> topicData = DirectProcessor.create();
    FluxSink<String> sink;
    int test;

    @GetMapping(value = "/addTopic")
    public void addTopic() {
        if (sink == null) {
            sink = topicData.sink();
        }
        sink.next(String.valueOf(test++));
    }

    @GetMapping(value = "/getTopic", produces = "text/event-stream")
    public Flux<String> getTopic() {
        Flux<String> autoConnect = topicData.publish().autoConnect();
        return autoConnect;
    }
}
As I use a DirectProcessor, no backpressure is possible, and I wonder how the Flux is consumed when sending through SSE. Can a subscriber request fewer elements than the number pushed into the Flux?
http://projectreactor.io/docs/core/release/reference/#_directprocessor
As a consequence, a DirectProcessor signals an IllegalStateException to its subscribers if you push N elements through it but at least one of its subscribers has requested less than N.
Subscribing with an SSE request does a request(1), not a request(Integer.MAX_VALUE).
So if I sink 1000 times, the processor overflows and an exception is thrown, even if it has subscribers:
reactor.core.Exceptions$OverflowException: Can't deliver value due to lack of requests
It is safer to use an EmitterProcessor or a ReplayProcessor in my case.
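For reference, a rough sketch of the same handler using an EmitterProcessor (my own adaptation, not verified against SSE backpressure behaviour):
@RestController
public class GenvScriptHandler {

    // EmitterProcessor honours subscriber demand and buffers up to its internal
    // buffer size instead of signalling an error like DirectProcessor does
    private final EmitterProcessor<String> topicData = EmitterProcessor.create();
    private final FluxSink<String> sink = topicData.sink(FluxSink.OverflowStrategy.BUFFER);
    private int test;

    @GetMapping("/addTopic")
    public void addTopic() {
        sink.next(String.valueOf(test++));
    }

    @GetMapping(value = "/getTopic", produces = "text/event-stream")
    public Flux<String> getTopic() {
        return topicData;
    }
}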