AMQP Appender pending message count - spring

We are sending audit log messages to a RabbitMQ cluster which is sometimes unavailable for reasons we cannot influence.
When the queue is not available, log messages start to accumulate locally and we get a out-of-memory eventually on the client.
We are using a AMQP Appender to submit our messages.
Is there a way we can query the count of pending log messages and raise an alert when messages start adding up?

Well, it isn't possible. There is just no any hooks to do that.
You can consider, though, to decrease maxSenderRetries from default 30 to 1 or 2. After that you'll start to lose log messages:
int retries = event.incrementRetries();
if (retries < AmqpAppender.this.maxSenderRetries) {
// Schedule a retry based on the number of times I've tried to re-send this
AmqpAppender.this.retryTimer.schedule(new TimerTask() {
#Override
public void run() {
AmqpAppender.this.events.add(event);
}
}, (long) (Math.pow(retries, Math.log(retries)) * 1000));
}
else {
addError("Could not send log message " + logEvent.getMessage()
+ " after " + AmqpAppender.this.maxSenderRetries + " retries", e);
}
We might have to expose queueSize option instead of default:
public LinkedBlockingQueue() {
this(Integer.MAX_VALUE);
}
Feel free to raise a JIRA on the matter.

Related

Orderly message send back %RETRY%CONSUMERGROUP

I have a question about the method of RocektMQ ConsumeMessageOrderlyService.sendMessageBack, The method comment:max reconsume times exceeded then send to dead letter queue。
But the message was actually sent to %RETYE%ConsumerGroup, this means that the message will then be consumed by the same group of consumers,and I have tried, the message was indeed sent first to SCHEDULE_TOPIC_XXXX, then delivered to %RETRY%ConsumerGroup, and never sent to Dead-Letter Queue.
this is the source code of ConsumeMessageOrderlyService.sendMessageBack:
public boolean sendMessageBack(final MessageExt msg) {
try {
// max reconsume times exceeded then send to dead letter queue.
Message newMsg = new Message(MixAll.getRetryTopic(this.defaultMQPushConsumer.getConsumerGroup()), msg.getBody());
MessageAccessor.setProperties(newMsg, msg.getProperties());
String originMsgId = MessageAccessor.getOriginMessageId(msg);
MessageAccessor.setOriginMessageId(newMsg, UtilAll.isBlank(originMsgId) ? msg.getMsgId() : originMsgId);
newMsg.setFlag(msg.getFlag());
MessageAccessor.putProperty(newMsg, MessageConst.PROPERTY_RETRY_TOPIC, msg.getTopic());
MessageAccessor.setReconsumeTime(newMsg, String.valueOf(msg.getReconsumeTimes()));
MessageAccessor.setMaxReconsumeTimes(newMsg, String.valueOf(getMaxReconsumeTimes()));
MessageAccessor.clearProperty(newMsg, MessageConst.PROPERTY_TRANSACTION_PREPARED);
newMsg.setDelayTimeLevel(3 + msg.getReconsumeTimes());
this.defaultMQPushConsumer.getDefaultMQPushConsumerImpl().getmQClientFactory().getDefaultMQProducer().send(newMsg);
return true;
} catch (Exception e) {
log.error("sendMessageBack exception, group: " + this.consumerGroup + " msg: " + msg.toString(), e);
}
return false;
}
I am confused about it.

SSE connection keeps failing every 5 minutes

I'm exposing a simple SSE endpoint using the SseEmitter Spring API, persisting all the emitters in a ConcurrentHashMap. The timeout for each emitter is set to 24 hours. Every 10 seconds I'm sending a message to all the clients. Clients are subscribed with native EventSource implementation, listening for events of particular name.
Unfortunately, I've noticed that every 5 minutes the connection is lost and reestablished again - even though timeout of emitter was explicitly set to 24 hours. Client's part does log it as an error, however on server side there's nothing. The issue occurs on both Tomcat and Jetty. I'd like to keep the session open without any interruptions, so resetting the connection every 5 minutes is unacceptable. Any ideas why this could be happening?
#RestController
#RequestMapping("api/v1/sse")
class SseController {
private val emitters = ConcurrentHashMap<String, SseEmitter>()
#GetMapping
fun initConnection(#RequestParam token: String): SseEmitter {
logger.info { "Init connection from $token" }
val emitter = SseEmitter(24 * 60 * 60 * 1000)
emitter.onCompletion {
logger.info { "Completion" }
emitters.remove(token)
}
emitter.onTimeout { logger.info { "Timeout " } }
emitter.onError { logger.error(it) { "Error" } }
emitters[token] = emitter
return emitter
}
#Scheduled(fixedRate = 10000)
fun send() {
emitters.forEach { (k, v) ->
logger.info { "Sending message to $k" }
v.send(
SseEmitter.event()
.id(UUID.randomUUID().toString())
.name("randomEvent")
.data("some data")
)
}
}
}
const eventSource = new EventSource(url);
eventSource.addEventListener('randomEvent', (e) =>
console.log(e.data)
);
eventSource.onerror = (e) => console.log(e);
Alright, seems it was an issue with Stackblitz's service worker. I've just implemented the same client-side solution in Chrome's plain console and the disconnecting is no longer happening.

Tuning Kafka for latency, packet loss, and unreachability

I am trying to optimize the performances of Kafka in a scenario where there is high latency (>500ms) and intermittent packet loss. I am working with JAVA and using 'kafka_2.13', version: '2.5.0' API. I have 24
nodes connected to a single broker, each node tries to send a small message to all the other subscribers. I observe that all nodes are able to communicate when there is no packet loss nor latency but they don't seem to be able to communicate soon after I add latency and packet loss. I will do more tests on Monday but I was wondering if anyone had any suggestions on possible configuration improvements.
Following you can see the code that I see to publish and receive messages and then the different configurations that used for consumers and producers.
Publishers:
boolean sendAsyncMessage (byte[] value, String topic) {
ProducerRecord<Long, byte[]> record = new ProducerRecord<> (topic, System.currentTimeMillis (), value);
long msStart = System.currentTimeMillis ();
producer.send (record, (metadata, exception) -> {
long msDelta = System.currentTimeMillis () - msStart;
logger.info ("Message with topic {} sent at {}, was ack after {}", topic, msStart, msDelta);
if (metadata == null) {
logger.info ("An exception was triggered during send:" + exception.toString ());
}
});
producer.flush ();
return true;
}
Subscribers:
while (keepGoing.get ()) {
try {
// java example do it every time!
subscribe ();
ConsumerRecords<Long, byte[]> consumerRecords = consumer.poll (Duration.ofMillis (2000));
manageMessage (consumerRecords);
//Thread processRecords = new Thread (() -> manageMessage (consumerRecords));
//processRecords.start ();
} catch (Exception e) {
logger.error ("Problem in polling: " + e.toString ());
}
}
Producer:
properties.put (ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, KafkaBroker.KEY_SERIALIZER);
properties.put (ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaBroker.VALUE_SERIALIZER);
properties.put (ProducerConfig.ACKS_CONFIG, reliable ? "1" : 0);
// host1:port1,host2:port2
properties.put (ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, server);
// how many bytes to buffer records waiting to be sent to the server
//properties.put (ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
properties.put (ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
properties.put (ProducerConfig.CLIENT_ID_CONFIG, clientID);
//15 MINUTES
properties.put (ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, 54000000);
// MAX UNCOMPRESSED MESSAGE SIZE
// properties.put (ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 1048576);
// properties.put (ProducerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG, 1000);
properties.put (ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, 300);
properties.put (ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 300);
Consumer
properties.put (ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, KafkaBroker.KEY_DESERIALIZER);
properties.put (ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaBroker.VALUE_DESERIALIZER);
// host1:port1,host2:port2
properties.put (ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, server);
// should be the topic
properties.put (ConsumerConfig.GROUP_ID_CONFIG, groupID);
properties.put (ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
properties.put (ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
properties.put (ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000");
Before trying to change all settings, I'd make a few changes in your logic:
Producer
Currently you are calling flush() after sending each message, effectively doing a synchronous send. This is not recommended as it forces the Kafka client to make a request to the cluster for every single message. This is pretty inefficient. In most cases, it's best to let the client decide when to actually send messages and not use flush().
Consumer
In each iteration, you are calling subscribe(), this is not needed. You should only call subscribe() when you want to change the subscription. Also creating a new thread in each poll() loop is not recommended! In addition to being slow, you risk creating hundreds or thousands of threads if the consumer starts fetching large amounts of messages.
Kafka is using a TCP protocol, so packets lost should be automatically retried. By default, Kafka clients are configured to retry most operations and automatically reconnect to brokers if a connection is lost.
When doing your tests, before changing configurations, you should see how the Kafka client is behaving by monitoring its metrics and logs. Are timeouts reached because of the latency? Are messages retried?
In the end, the biggest factor that was preventing my distributed system to correctly communicate was the producer acks option. In the beginning, we had set this to all (the strictest option) and it seems that that, paired with the deteriorated network was preventing Kafka to have performances similar to other TCP based protocols. We now use 0 for unreliable messages and 1 for reliable.

Spring Data Redis Streams, Cannot figure out what is happening to my unacknowleded messages?

I am using the following code to consume a Redis stream using a Spring Data Redis consumer group, but even though I have commented out the acknowledge command, my messages are not re-read after a server restart.
I would expect that if I didn't acknowledge the message, it should be re-read when the server gets killed and restarted. What am I missing here?
#Bean
#Autowired
public StreamMessageListenerContainer eventStreamPersistenceListenerContainerTwo(RedisConnectionFactory streamRedisConnectionFactory, RedisTemplate streamRedisTemplate) {
StreamMessageListenerContainer.StreamMessageListenerContainerOptions<String, MapRecord<String, String, String>> containerOptions = StreamMessageListenerContainer.StreamMessageListenerContainerOptions
.builder().pollTimeout(Duration.ofMillis(100)).build();
StreamMessageListenerContainer<String, MapRecord<String, String, String>> container = StreamMessageListenerContainer.create(streamRedisConnectionFactory,
containerOptions);
container.receive(Consumer.from("my-group", "my-consumer"),
StreamOffset.create("event-stream", ReadOffset.latest()),
message -> {
System.out.println("MessageId: " + message.getId());
System.out.println("Stream: " + message.getStream());
System.out.println("Body: " + message.getValue());
//streamRedisTemplate.opsForStream().acknowledge("my-group", message);
});
container.start();
return container;
}
After reading the Redis documentation on how streams work, I came up with the following to automatically process any unacknowledged but previously delivered messages for the consumer:
// Check for any previously unacknowledged messages that were delivered to this consumer.
log.info("STREAM - Checking for previously unacknowledged messages for " + this.getClass().getSimpleName() + " event stream listener.");
String offset = "0";
while ((offset = processUnacknowledgedMessage(offset)) != null) {
log.info("STREAM - Finished processing one unacknowledged message for " + this.getClass().getSimpleName() + " event stream listener: " + offset);
}
log.info("STREAM - Finished checking for previously unacknowledged messages for " + this.getClass().getSimpleName() + " event stream listener.");
And the method that processes the messages:
/**
* Processes, and acknowledges the next previously delivered message, beginning
* at the given message id offset.
*
* #param offset The last read message id offset.
* #return The message that was just processed, or null if there are no more messages.
*/
public String processUnacknowledgedMessage(String offset) {
List<MapRecord> messages = streamRedisTemplate.opsForStream().read(Consumer.from(groupName(), consumerName()),
StreamReadOptions.empty().noack().count(1),
StreamOffset.create(streamKey(), ReadOffset.from(offset)));
String lastMessageId = null;
for (MapRecord message : messages) {
if (log.isDebugEnabled()) log.debug(String.format("STREAM - Processing event(%s) from stream(%s) during startup: %s", message.getId(), message.getStream(), message.getValue()));
processRecord(message);
if (log.isDebugEnabled()) log.debug(String.format("STREAM - Finished processing event(%s) from stream(%s) during startup.", message.getId(), message.getStream()));
streamRedisTemplate.opsForStream().acknowledge(groupName(), message);
lastMessageId = message.getId().getValue();
}
return lastMessageId;
}

google-cloud pubsub leaves messages in queue after sending ack

I have this subscriber code:
try {
//subscriber
syncSubscriber.createSubscriber(SdkServiceConfig.s.SUBSCRIPTION_NAME_PARTNER_REQUEST);
final List<ReceivedMessage> messages = syncSubscriber.fetch(10, true);//get all current messages.
List<String> ackIds = new ArrayList<>();
for (ReceivedMessage message : messages) {
requestToCofmanSender.receiveMessage(message.getMessage());
ackIds.add(message.getAckId());
}
//preferred bulk ack, due to network performance
syncSubscriber.sendAck(ackIds);
requestToCofmanSender.getWazePublisher().shutdown();
}
and
public void sendAck(Collection<String> ackIdList) {
if (ackIdList != null && ackIdList.size() != 0) {
String subscriptionName = SubscriptionName.format(this.getProjectId(), this.subscriptionId);
AcknowledgeRequest acknowledgeRequest = AcknowledgeRequest.newBuilder().setSubscription(subscriptionName).addAllAckIds(ackIdList).build();
this.subscriber.acknowledgeCallable().call(acknowledgeRequest);
}
}
I poll the pubsub queue in loop
and even though the code sends ack i still get the same messages.
how should i ack otherwise?
My problem was that i had a break point between receiving the message and sending ack. My pubsub was configured to 10 seconds timeout.

Resources