Spring Boot service to consume Kafka messages on demand - spring-boot

I have a requirement for a Spring Boot REST service that a client application will call every 30 minutes, and the service should return:
the latest messages, with the count specified in a query param, e.g. http://messages.com/getNewMessages?number=10 should return 10 messages
a number of messages starting at a given offset, both specified in query params, e.g. http://messages.com/getSpecificMessages?number=5&start=123 should return 5 messages starting at offset 123.
I have a simple standalone application and it works fine. Here is what I tested; I would like some direction on incorporating it into the service.
public static void main(String[] args) {
    // create the Kafka consumer; max.poll.records is taken from the command line
    Properties properties = new Properties();
    properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    properties.put(ConsumerConfig.GROUP_ID_CONFIG, "my-first-consumer-group");
    properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    properties.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, args[0]);
    Consumer<String, String> consumer = new KafkaConsumer<>(properties);

    // subscribe to the topic and poll once so partitions get assigned
    consumer.subscribe(Collections.singleton("test"));
    consumer.poll(0);

    // seek to the specified offset, then fetch the requested number of messages
    // (seek takes a long, so the command-line argument must be parsed)
    for (TopicPartition partition : consumer.assignment())
        consumer.seek(partition, Long.parseLong(args[1]));

    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(5000));
    System.out.println("Total Record Count ******* : " + records.count());
    for (ConsumerRecord<String, String> record : records) {
        System.out.println("Message: " + record.value());
        System.out.println("Message offset: " + record.offset());
        System.out.println("Message timestamp: " + record.timestamp());
        Date date = new Date(record.timestamp());
        Format format = new SimpleDateFormat("yyyy MM dd HH:mm:ss.SSS");
        System.out.println("Message date: " + format.format(date));
    }
    consumer.commitSync();
}
As my consumer will run on demand, I am wondering how I can achieve this in a Spring Boot service. Where do I specify the properties? If I put them in application.properties they get injected at startup, but how do I control MAX_POLL_RECORDS_CONFIG at runtime? Any help appreciated.

MAX_POLL_RECORDS_CONFIG only affects how many records the kafka-client hands back to your Spring service per poll; it never reduces the bytes that the consumer fetches from the Kafka server.
[Diagram of a partition's offsets, not reproduced here.] No matter whether your start offset is 150 or 190, the Kafka server returns the whole fetched range (offset=110 to offset=190); the server does not even know how many records it is returning to the consumer, it only knows the byte size (220 - 110).
So I think you have to control the record count yourself; currently it is controlled by the kafka-client jar, and either way the records occupy your JVM's local memory.
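A sketch of doing that control yourself in a Spring Boot REST service (the controller and endpoint names are hypothetical; the broker address, topic, and group id are copied from the question's standalone code) is to build a short-lived consumer per request and pass the query param straight into max.poll.records:

import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MessageController {

    @GetMapping("/getSpecificMessages")
    public List<String> getSpecificMessages(@RequestParam int number,
                                            @RequestParam long start) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-first-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        // runtime value from the query param instead of application.properties
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, number);

        List<String> messages = new ArrayList<>();
        try (Consumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("test"));
            // poll until the group coordinator has assigned partitions
            while (consumer.assignment().isEmpty()) {
                consumer.poll(Duration.ofMillis(100));
            }
            for (TopicPartition partition : consumer.assignment()) {
                consumer.seek(partition, start);
            }
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                messages.add(record.value());
            }
            consumer.commitSync();
        }
        return messages;
    }
}

Creating a consumer per request is simple but not free; if the call volume grows beyond one request every 30 minutes, a pooled or cached consumer would be the usual refinement.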

The answer to your question is here, and the answer with a code example is this answer.
Both were written by the excellent Gary Russell, one of the main people behind Spring Kafka.
TL;DR:
If you want to arbitrarily rewind the partitions at runtime, have your listener implement ConsumerSeekAware and grab a reference to the ConsumerSeekCallback.
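A minimal sketch of that pattern, assuming Spring for Apache Kafka 2.3 or later (the class name, topic, group id, and the seekToOffset method are illustrative, not taken from Gary's answers):

import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.listener.ConsumerSeekAware;
import org.springframework.stereotype.Component;

@Component
public class SeekingListener implements ConsumerSeekAware {

    // callbacks captured per assigned partition when the container starts
    private final Map<TopicPartition, ConsumerSeekCallback> callbacks = new ConcurrentHashMap<>();

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments,
                                     ConsumerSeekCallback callback) {
        assignments.keySet().forEach(tp -> callbacks.put(tp, callback));
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        partitions.forEach(callbacks::remove);
    }

    @KafkaListener(topics = "test", groupId = "my-first-consumer-group")
    public void listen(String message) {
        // normal message handling
    }

    // call this (e.g. from a REST controller) to rewind at runtime; with recent
    // Spring Kafka versions, seeks issued off the consumer thread are queued
    // and applied when the consumer thread next polls
    public void seekToOffset(long offset) {
        callbacks.forEach((tp, cb) -> cb.seek(tp.topic(), tp.partition(), offset));
    }
}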

Related

is it possible to call a Microservice (spring boot) from a Camunda service Task?

Workflow -> (https://i.stack.imgur.com/vgtiD.png)
Is it possible to call a microservice from a Camunda task?
1. The start event will receive a JSON with the client data.
2. The service task should connect to a microservice (Spring Boot) that stores the data in the database; it just needs to pass the JSON with the info to the microservice and then complete the task.
3. If the previous task is completed, this task should run.
Is there a way to do it? I am very new at Camunda.
I tried an External Task but it didn't work.
Yes you can, check the documentation:
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.camunda.bpm.client.spring.annotation.ExternalTaskSubscription;
import org.camunda.bpm.client.task.ExternalTask;
import org.camunda.bpm.client.task.ExternalTaskHandler;
import org.camunda.bpm.client.task.ExternalTaskService;
import org.camunda.bpm.engine.variable.VariableMap;
import org.camunda.bpm.engine.variable.Variables;
import org.springframework.stereotype.Component;

@Component
@ExternalTaskSubscription("scoreProvider") // create a subscription for this topic name
public class ProvideScoreHandler implements ExternalTaskHandler {

    @Override
    public void execute(ExternalTask externalTask, ExternalTaskService externalTaskService) {
        // only for the sake of this demonstration, we generate random data
        // in a real-world scenario, we would load the data from a database
        String customerId = "C-" + UUID.randomUUID().toString().substring(32);
        int creditScore = (int) (Math.random() * 11);
        VariableMap variables = Variables.createVariables();
        variables.put("customerId", customerId);
        variables.put("creditScore", creditScore);
        // complete the external task
        externalTaskService.complete(externalTask, variables);
        Logger.getLogger("scoreProvider")
                .log(Level.INFO, "Credit score {0} for customer {1} provided!",
                        new Object[]{creditScore, customerId});
    }
}
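For this handler to reach the engine, the external task client also needs the engine's REST endpoint configured, typically in application.properties; the URL below assumes a local engine and is only illustrative:

camunda.bpm.client.base-url=http://localhost:8080/engine-rest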
Spring boot with Camunda example

Differences in total message length between spring JMS and native IBM MQ libraries

I send a simple text message to an MQ Queue (MQ 7.0.1):
"abc"
Using Spring JMS, the total length of the message is 291.
But putting the same message on the queue using the native IBM MQ libraries, the total length of the message is 3.
How can I get a total data length of 3 with JMS?
Spring JMS code:
@EnableJms
public class JMSTestController {
    ...
    @Autowired
    private JmsTemplate jmsTemplate;
    @Autowired
    JmsMessagingTemplate jmsMessagingTemplate;
    ...
    public String send() throws JMSException {
        jmsTemplate.setReceiveTimeout(10000);
        jmsMessagingTemplate.setJmsTemplate(jmsTemplate);
        Session session = jmsMessagingTemplate.getConnectionFactory().createConnection()
                .createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue entryQueue = session.createQueue("hereQueueName");
        Queue replyQueue = session.createQueue("hereReplyQueueName");
        TextMessage message = session.createTextMessage("abc");
        message.setJMSDeliveryMode(DeliveryMode.NON_PERSISTENT);
        message.setJMSDestination(entryQueue);
        message.setIntProperty(WMQConstants.JMS_IBM_CHARACTER_SET, 819);
        message.setIntProperty(WMQConstants.JMS_IBM_ENCODING, 273);
        jmsMessagingTemplate.convertAndSend(entryQueue, message);
        String messageId = message.getJMSMessageID();
        ...
    }
}
Native code:
MQQueueManager qm = createQueueManager(queueManager, host, port,
        channel, username, password, connectionType);
MQQueue m_receiver = null;
MQMessage msg = new MQMessage();
msg.format = MQC.MQFMT_STRING;
msg.expiry = timeout / 1000;
msg.replyToQueueName = qReceiver;
msg.replyToQueueManagerName = queueManager;
msg.write("abc".getBytes());
MQPutMessageOptions pmo = new MQPutMessageOptions();
try {
    qm.put(qSender, msg, pmo);
} catch (MQException e) {
    MQTalkerException ex = new MQTalkerException(
            "An error happened sending a message", e);
    logger.error(ex);
    throw ex;
}
Solution
Following JoshMc's comment I made the following modification and reached the expected result:
Check out these answers, you want to set targetClient to MQ to remove those properties. There are many ways to accomplish this, changing your CreateQueue to use a URI is probably the easiest.
JMS transport v/s MQ transport
That is, modify the creation of the queue using the URI instead of just its name.
Queue entryQueue = session.createQueue("queue:///QUEUE_NAME?targetClient=1");
This removes the MQRFH2 header (the extra bytes whose origin I didn't know), and with that the message has a total length of 3 bytes.
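An alternative to the URI form, as a sketch assuming the IBM MQ classes for JMS are on the classpath, is to cast the destination created by the session in the Spring JMS code above and set the target client programmatically (WMQ_CLIENT_NONJMS_MQ corresponds to targetClient=1):

import javax.jms.Queue;

import com.ibm.mq.jms.MQDestination;
import com.ibm.msg.client.wmq.WMQConstants;

// suppress the MQRFH2 header so only the 3-byte body is sent
Queue entryQueue = session.createQueue("QUEUE_NAME");
((MQDestination) entryQueue).setTargetClient(WMQConstants.WMQ_CLIENT_NONJMS_MQ);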
Spring is counting only the bytes of the message body (i.e. the data); native IBM MQ is counting the bytes of the message headers plus the body.
In your screenshot, the field directly above shows 3 bytes:
Longitud datos (data length) = length of body = 3
Longitud total (total length) = length of headers + body = 291

Listen to another message only when I am done with my current message in Kafka

I am building a Spring Boot application using Spring Kafka, where I am getting messages from a topic. I have to modify those messages and then produce them to another topic. I don't want to consume any other message until I have processed the current one. How can I achieve this?
@KafkaListener(
        topics = "${event.topic.name}",
        groupId = "${event.topic.group.id}",
        containerFactory = "eventKafkaListenerContainerFactory"
)
public void consume(Event event) {
    logger.info(String.format("Event created(from consumer)-> %s", event));
}
"event" is a json object which I am receiving as a message.
See https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#consumerconfigs_max.poll.records:
max.poll.records
The maximum number of records returned in a single call to poll().
Type: int
Default: 500
With Spring Boot you can configure it with this property:
spring.kafka.consumer.max-poll-records
So if you set it to 1, no more records will be polled from this consumer until you return from your @KafkaListener method.
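A minimal sketch of the consume-transform-produce flow under that setting (Event is the question's JSON-mapped class; the output topic, the transform step, and a JSON-configured producer are assumptions):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

// application.properties (assumption):
//   spring.kafka.consumer.max-poll-records=1
@Component
public class EventForwarder {

    @Autowired
    private KafkaTemplate<String, Event> kafkaTemplate;

    @KafkaListener(topics = "${event.topic.name}", groupId = "${event.topic.group.id}")
    public void consume(Event event) {
        // the container will not poll for the next record until this method returns
        Event modified = transform(event); // placeholder for the modification logic
        kafkaTemplate.send("output-topic", modified);
    }

    private Event transform(Event event) {
        return event; // placeholder
    }
}

Note that a record listener already receives records one at a time even with a larger max.poll.records; setting it to 1 additionally ensures that only one record is fetched from the broker per poll.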

Best strategy to Handle large data in Apache Camel

I am using Apache Camel to generate monthly reports. I have a MySQL query which, when run against my DB, generates around 5 million records (20 columns each). The query itself takes approximately 70 minutes to execute.
To speed up the process, I created 5 seda (worker) routes and used multicast().parallelProcessing(), which query the DB in parallel for different time ranges; the results are then merged using an aggregator.
Now I can see 5 million records in my exchange body (in the form of List<HashMap<String, Object>>). When I try to format this using Camel Bindy to generate a CSV file out of this data, I get a "GC overhead limit exceeded" error. I tried increasing the Java heap size, but then the transformation takes forever.
Is there any other method to convert this raw data into a well-formatted CSV file? Can Java 8 streams be useful?
Code
from("direct://logs/testLogs")
.routeId("Test_Logs_Route")
.setProperty("Report", simple("TestLogs-${date:now:yyyyMMddHHmm}"))
.bean(Logs.class, "buildLogsQuery") // bean that generates the logs query
.multicast()
.parallelProcessing()
.to("seda:worker1?waitForTaskToComplete=Always&timeout=0", // worker routes
"seda:worker2?waitForTaskToComplete=Always&timeout=0",
"seda:worker3?waitForTaskToComplete=Always&timeout=0",
"seda:worker4?waitForTaskToComplete=Always&timeout=0",
"seda:worker5?waitForTaskToComplete=Always&timeout=0");
All my worker routes look like this:
from("seda:worker4?waitForTaskToComplete=Always")
    .routeId("ParallelProcessingWorker4")
    .log(LoggingLevel.INFO, "Parallel Processing Worker 4 Flow Started")
    .setHeader("WorkerId", constant(4))
    .bean(Logs.class, "testBean") // appends time-clause to the query based on WorkerId
    .to("jdbc:oss-ro-ds")
    .to("seda:resultAggregator?waitForTaskToComplete=Always&timeout=0");
Aggregator
from("seda:resultAggregator?waitForTaskToComplete=Always&timeout=0")
.routeId("Aggregator_ParallelProcessing")
.log(LoggingLevel.INFO, "Aggregation triggered for processor ${header.WorkerId}")
.aggregate(header("Report"), new ParallelProcessingAggregationStrategy())
.completionSize(5)
.to("direct://logs/processResultSet")
from("direct://logs/processResultSet")
.routeId("Process_Result_Set")
.bean(Test.class, "buildLogReport");
.marshal(myLogBindy)
.to("direct://deliver/ooma");
Method buildLogReport
public void buildLogReport(List<HashMap<String, Object>> resultEntries, Exchange exchange) throws Exception {
    Map<String, Object> headerMap = exchange.getIn().getHeaders();
    ArrayList<MyLogEntry> reportList = new ArrayList<>();
    // drain the list entry by entry so processed rows can be garbage collected
    while (!resultEntries.isEmpty()) {
        HashMap<String, Object> resultEntry = resultEntries.get(0);
        MyLogEntry logEntry = new MyLogEntry();
        logEntry.setA((String) resultEntry.get("A"));
        logEntry.setB((String) resultEntry.get("B"));
        logEntry.setC(((BigDecimal) resultEntry.get("C")).toString());
        if (null != resultEntry.get("D"))
            logEntry.setD(((BigInteger) resultEntry.get("D")).toString());
        logEntry.setE((String) resultEntry.get("E"));
        logEntry.setF((String) resultEntry.get("F"));
        logEntry.setG(((BigDecimal) resultEntry.get("G")).toString());
        logEntry.setH((String) resultEntry.get("H"));
        logEntry.setI(((Long) resultEntry.get("I")).toString());
        logEntry.setJ((String) resultEntry.get("J"));
        logEntry.setK(TimeUtils.convertDBToTZ((Date) resultEntry.get("K"), (String) headerMap.get("TZ")));
        logEntry.setL(((BigDecimal) resultEntry.get("L")).toString());
        logEntry.setM((String) resultEntry.get("M"));
        logEntry.setN((String) resultEntry.get("State"));
        logEntry.setO((String) resultEntry.get("Zip"));
        logEntry.setP("\"" + (String) resultEntry.get("Type") + "\"");
        logEntry.setQ((String) resultEntry.get("Gate"));
        reportList.add(logEntry);
        resultEntries.remove(0);
    }
    // Transform The Exchange Message
    exchange.getIn().setBody(reportList);
}
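If the goal is just a CSV file, one way to avoid holding a second 5-million-element list on the heap (a sketch; the column names, naive comma-joining, and file path are illustrative, and it bypasses Bindy entirely) is to stream each row straight to disk as it is mapped:

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;

public class CsvStreamingWriter {

    // writes rows one at a time so only a single row is held in memory
    public static void writeCsv(List<HashMap<String, Object>> resultEntries, String path)
            throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(Paths.get(path))) {
            writer.write("A,B,C"); // header; the column list is illustrative
            writer.newLine();
            for (HashMap<String, Object> row : resultEntries) {
                // real CSV output would need quoting/escaping of field values
                writer.write(row.get("A") + "," + row.get("B") + "," + row.get("C"));
                writer.newLine();
            }
        }
    }
}

Writing rows as they are mapped bounds the transformation's memory footprint regardless of the result size, instead of building the whole reportList before marshalling.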

JmsTemplate's browseSelected not retrieving all messages

I have some Java code that reads messages from an ActiveMQ queue. The code uses a JmsTemplate from Spring and I use the "browseSelected" method to retrieve any messages from the queue that have a timestamp in their header older than 7 days (by creating the appropriate criteria as part of the messageSelector parameter).
myJmsTemplate.browseSelected(myQueue, myCriteria, new BrowserCallback<Integer>() {
    @Override
    public Integer doInJms(Session s, QueueBrowser qb) throws JMSException {
        @SuppressWarnings("unchecked")
        final Enumeration<Message> e = qb.getEnumeration();
        int count = 0;
        while (e.hasMoreElements()) {
            final Message m = e.nextElement();
            final TextMessage tm = (TextMessage) MyClass.this.jmsQueueTemplate.receiveSelected(
                    MyClass.this.myQueue, "JMSMessageID = '" + m.getJMSMessageID() + "'");
            myMessages.add(tm);
            count++;
        }
        return count;
    }
});
The BrowserCallback's "doInJms" method adds the messages which match the criteria to a list ("myMessages") which subsequently get processed further.
The issue is that I'm finding the code will only process 400 messages each time it runs, even though there are several thousand messages which match the criteria specified.
When I previously used another queueing technology with this code (IBM MQ), it would process all records which met the criteria.
I'm wondering whether I'm experiencing an issue with ActiveMQ's prefetch limit: http://activemq.apache.org/what-is-the-prefetch-limit-for.html
Versions: ActiveMQ 5.10.1 and Spring 3.2.2.
Thanks in advance for any assistance.
The broker will only return up to 400 messages by default, as configured by the maxBrowsePageSize option in the destination policies. You can increase that value, but use caution: browsed messages are paged into memory, so a large value can lead you into an OOM situation.
You must always remember that a message broker is not a database; using it as one will generally end in tears.
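For reference, the limit can be raised per destination in the broker's activemq.xml; a sketch (the '>' queue pattern matches all queues, and 10000 is an illustrative value, subject to the memory caveat above):

<destinationPolicy>
  <policyMap>
    <policyEntries>
      <!-- raise the browse page size for all queues -->
      <policyEntry queue=">" maxBrowsePageSize="10000"/>
    </policyEntries>
  </policyMap>
</destinationPolicy>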
