I dont understand why the store "statStore" is empty in that topology :
StreamsBuilder builder = new StreamsBuilder();
KTable<String, PersonAvro> person$ = builder.table("sc_loc_person_priv_v4");
person$.mapValues(person -> CustomerAvro.newBuilder()
.setId(person.getId())
.setCompName(person.getCompName())
.setSiretCode(person.getSiretCode())
.setActive(person.getActive())
.setAdd3(person.getAdd3())
.setAdd4(person.getAdd4())
.setAdd5(person.getAdd5())
.setAdd6(person.getAdd6())
.setAdd7(person.getAdd7()))
.mapValues(CustomerAvro.Builder::build)
.toStream()
.peek((key, personne) -> {
journal.debug("{}, {}", key, personne);
})
.through("sc_loc_customer_priv_v1")
.groupBy((String key, CustomerAvro customer) -> customer.getActive())
.count(Materialized.as("statStore"));
Topology topology = builder.build();
KafkaStreams streams = new KafkaStreams(topology, props);
streams.start();
store = waitUntilStoreIsQueryable("statStore", streams);
Naturaly, i putted messages in the topic sc_loc_person_priv_v4.
If i simply do that topology, the store is not empty:
#PostConstruct
private void init() throws InterruptedException {
configurer();
producer = new KafkaProducer<>(props);
StreamsBuilder builder = new StreamsBuilder();
builder.globalTable(kafkaTopic, Materialized.as("customerStore"));
Topology topology = builder.build();
KafkaStreams streams = new KafkaStreams(topology, props);
streams.start();
store = waitUntilStoreIsQueryable("customerStore", streams);
}
public long getNb() {
return store.approximateNumEntries();
}
Related
There seem to be an issue when I use AggregatingReplyingKafkaTemplate with template.setReturnPartialOnTimeout(true) in that, it returns timeout exception even if partial results are available from consumers.
In example below, I have 3 consumers to reply to the request topic and i've set the reply timeout at 10 seconds. I've explicitly delayed the response of Consumer 3 to 11 seconds, however, I expect the response back from Consumer 1 and 2, so, I can return partial results. However, I am getting KafkaReplyTimeoutException. Appreciate your inputs. Thanks.
I follow the code based on the Unit Test below.
[ReplyingKafkaTemplateTests][1]
I've provided the actual code below:
#RestController
public class SumController {
#Value("${kafka.bootstrap-servers}")
private String bootstrapServers;
public static final String D_REPLY = "dReply";
public static final String D_REQUEST = "dRequest";
#ResponseBody
#PostMapping(value="/sum")
public String sum(#RequestParam("message") String message) throws InterruptedException, ExecutionException {
AggregatingReplyingKafkaTemplate<Integer, String, String> template = aggregatingTemplate(
new TopicPartitionOffset(D_REPLY, 0), 3, new AtomicInteger());
String resultValue ="";
String currentValue ="";
try {
template.setDefaultReplyTimeout(Duration.ofSeconds(10));
template.setReturnPartialOnTimeout(true);
ProducerRecord<Integer, String> record = new ProducerRecord<>(D_REQUEST, null, null, null, message);
RequestReplyFuture<Integer, String, Collection<ConsumerRecord<Integer, String>>> future =
template.sendAndReceive(record);
future.getSendFuture().get(5, TimeUnit.SECONDS); // send ok
System.out.println("Send Completed Successfully");
ConsumerRecord<Integer, Collection<ConsumerRecord<Integer, String>>> consumerRecord = future.get(10, TimeUnit.SECONDS);
System.out.println("Consumer record size "+consumerRecord.value().size());
Iterator<ConsumerRecord<Integer, String>> iterator = consumerRecord.value().iterator();
while (iterator.hasNext()) {
currentValue = iterator.next().value();
System.out.println("response " + currentValue);
System.out.println("Record header " + consumerRecord.headers().toString());
resultValue = resultValue + currentValue + "\r\n";
}
} catch (Exception e) {
System.out.println("Error Message is "+e.getMessage());
}
return resultValue;
}
public AggregatingReplyingKafkaTemplate<Integer, String, String> aggregatingTemplate(
TopicPartitionOffset topic, int releaseSize, AtomicInteger releaseCount) {
//Create Container Properties
ContainerProperties containerProperties = new ContainerProperties(topic);
containerProperties.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
//Set the consumer Config
//Create Consumer Factory with Consumer Config
DefaultKafkaConsumerFactory<Integer, Collection<ConsumerRecord<Integer, String>>> cf =
new DefaultKafkaConsumerFactory<>(consumerConfigs());
//Create Listener Container with Consumer Factory and Container Property
KafkaMessageListenerContainer<Integer, Collection<ConsumerRecord<Integer, String>>> container =
new KafkaMessageListenerContainer<>(cf, containerProperties);
// container.setBeanName(this.testName);
AggregatingReplyingKafkaTemplate<Integer, String, String> template =
new AggregatingReplyingKafkaTemplate<>(new DefaultKafkaProducerFactory<>(producerConfigs()), container,
(list, timeout) -> {
releaseCount.incrementAndGet();
return list.size() == releaseSize;
});
template.setSharedReplyTopic(true);
template.start();
return template;
}
public Map<String, Object> consumerConfigs() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,bootstrapServers);
props.put(ConsumerConfig.GROUP_ID_CONFIG, "test_id");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class);
return props;
}
public Map<String, Object> producerConfigs() {
Map<String, Object> props = new HashMap<>();
// list of host:port pairs used for establishing the initial connections to the Kakfa cluster
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
bootstrapServers);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
org.apache.kafka.common.serialization.StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class);
return props;
}
public ProducerFactory<Integer,String> producerFactory() {
return new DefaultKafkaProducerFactory<>(producerConfigs());
}
#KafkaListener(id = "def1", topics = { D_REQUEST}, groupId = "D_REQUEST1")
#SendTo // default REPLY_TOPIC header
public String dListener1(String in) throws InterruptedException {
return "First Consumer : "+ in.toUpperCase();
}
#KafkaListener(id = "def2", topics = { D_REQUEST}, groupId = "D_REQUEST2")
#SendTo // default REPLY_TOPIC header
public String dListener2(String in) throws InterruptedException {
return "Second Consumer : "+ in.toLowerCase();
}
#KafkaListener(id = "def3", topics = { D_REQUEST}, groupId = "D_REQUEST3")
#SendTo // default REPLY_TOPIC header
public String dListener3(String in) throws InterruptedException {
Thread.sleep(11000);
return "Third Consumer : "+ in;
}
}
'''
[1]: https://github.com/spring-projects/spring-kafka/blob/master/spring-kafka/src/test/java/org/springframework/kafka/requestreply/ReplyingKafkaTemplateTests.java
template.setReturnPartialOnTimeout(true) simply means the template will consult the release strategy on timeout (with the timeout argument = true, to tell the strategy it's a timeout rather than a delivery call).
It must return true to release the partial result.
This is to allow you to look at (and possibly modify) the list to decide whether you want to release or discard.
Your strategy ignores the timeout parameter:
(list, timeout) -> {
releaseCount.incrementAndGet();
return list.size() == releaseSize;
});
You need return timeout ? true : { ... }.
I have a requirement to fetch timestamp (event-time) when the message was produced, in the kafka consumer application. I am aware of the timestampExtractor, which can be used with kafka stream , but my requirement is different as I am not using stream to consume message.
My kafka producer is as follows :
#Override
public void run(ApplicationArguments args) throws Exception {
List<String> names = Arrays.asList("priya", "dyser", "Ray", "Mark", "Oman", "Larry");
List<String> pages = Arrays.asList("blog", "facebook", "instagram", "news", "youtube", "about");
Runnable runnable = () -> {
String rPage = pages.get(new Random().nextInt(pages.size()));
String rName = pages.get(new Random().nextInt(names.size()));
PageViewEvent pageViewEvent = new PageViewEvent(rName, rPage, Math.random() > .5 ? 10 : 1000);
Message<PageViewEvent> message = MessageBuilder
.withPayload(pageViewEvent).
setHeader(KafkaHeaders.MESSAGE_KEY, pageViewEvent.getUserId().getBytes())
.build();
try {
this.pageViewsOut.send(message);
log.info("sent " + message);
} catch (Exception e) {
log.error(e);
}
};
Kafka Consumer is implemented using Spring kafka #KafkaListener.
#KafkaListener(topics = "test1" , groupId = "json", containerFactory = "kafkaListenerContainerFactory")
public void receive(#Payload PageViewEvent data,#Headers MessageHeaders headers) {
LOG.info("Message received");
LOG.info("received data='{}'", data);
}
Container factory configuration
#Bean
public ConsumerFactory<String,PageViewEvent > priceEventConsumerFactory() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
props.put(ConsumerConfig.GROUP_ID_CONFIG, "json");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return new DefaultKafkaConsumerFactory<>(props, new StringDeserializer(), new JsonDeserializer<>(PageViewEvent.class));
}
#Bean
public ConcurrentKafkaListenerContainerFactory<String, PageViewEvent> priceEventsKafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, PageViewEvent> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(priceEventConsumerFactory());
return factory;
}
The producer, which is sending the message when I print give me below data :
[payload=PageViewEvent(userId=blog, page=about, duration=10),
headers={id=8ebdad85-e2f7-958f-500e-4560ac0970e5,
kafka_messageKey=[B#71975e1a, contentType=application/json,
timestamp=1553041963803}]
This does have a produced timestamp. How can I fetch the message produced time stamp with Spring kafka?
RECEIVED_TIMESTAMP means it is the time stamp from the record that was received not the time it was received.. We avoid putting it in TIMESTAMP to avoid inadvertent propagation to an outbound message.
You can use something like below:
final Producer<String, String> producer = new KafkaProducer<String, String>(properties);
long time = System.currentTimeMillis();
final CountDownLatch countDownLatch = new CountDownLatch(5);
int count=0;
try {
for (long index = time; index < time + 10; index++) {
String key = null;
count++;
if(count<=5)
key = "id_"+ Integer.toString(1);
else
key = "id_"+ Integer.toString(2);
final ProducerRecord<String, String> record =
new ProducerRecord<>(TOPIC, key, "B2B Sample Message: " + count);
producer.send(record, (metadata, exception) -> {
long elapsedTime = System.currentTimeMillis() - time;
if (metadata != null) {
System.out.printf("sent record(key=%s value=%s) " +
"meta(partition=%d, offset=%d) time=%d timestamp=%d\n",
record.key(), record.value(), metadata.partition(),
metadata.offset(), elapsedTime, metadata.timestamp());
System.out.println("Timestamp:: "+metadata.timestamp() );
} else {
exception.printStackTrace();
}
countDownLatch.countDown();
});
}
try {
countDownLatch.await(25, TimeUnit.SECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}finally {
producer.flush();
producer.close();
}
}
In spring boot 1.4.2.RELEASE, PropertiesConfigurationFactory has method setProperties(Properties properties).
So my code can write as follow:
public Map<String, CacheProperties> resolve() {
Map<String, Object> cacheSettings = resolveCacheSettings();
Set<String> beanNames = resolveCacheManagerBeanNames(cacheSettings);
Map<String, CacheProperties> cachePropertiesMap = new HashMap<>(beanNames.size());
MutablePropertySources propertySources = new MutablePropertySources();
propertySources.addFirst(new MapPropertySource("cache", cacheSettings));
beanNames.forEach(beanName -> {
CacheProperties cacheProperties = new CacheProperties();
PropertiesConfigurationFactory<CacheProperties> factory = new PropertiesConfigurationFactory<>(cacheProperties);
Properties properties = new Properties();
properties.putAll(PropertySourceUtils.getSubProperties(propertySources, beanName));
factory.setProperties(properties);
factory.setConversionService(environment.getConversionService());
try {
factory.bindPropertiesToTarget();
cachePropertiesMap.put(beanName, cacheProperties);
} catch (Exception e) {
throw new RuntimeException(e);
}
});
return cachePropertiesMap;
}
But when I upgrade to spring boot 1.5.8.RELEASE. PropertiesConfigurationFactory changed the method setProperties(Properties properties) to setPropertySources(PropertySources propertySources).
So I changed my code like this:
public Map<String, CacheProperties> resolve() {
Map<String, Object> cacheSettings = resolveCacheSettings();
Set<String> beanNames = resolveCacheManagerBeanNames(cacheSettings);
Map<String, CacheProperties> cachePropertiesMap = new HashMap<>(beanNames.size());
MutablePropertySources propertySources = new MutablePropertySources();
propertySources.addFirst(new MapPropertySource("cache", cacheSettings));
beanNames.forEach(beanName -> {
CacheProperties cacheProperties = new CacheProperties();
PropertiesConfigurationFactory<CacheProperties> factory = new PropertiesConfigurationFactory<>(cacheProperties);
//Properties properties = new Properties();
//properties.putAll(PropertySourceUtils.getSubProperties(propertySources, beanName));
//factory.setProperties(properties);
factory.setPropertySources(propertySources);
factory.setConversionService(environment.getConversionService());
try {
factory.bindPropertiesToTarget();
cachePropertiesMap.put(beanName, cacheProperties);
} catch (Exception e) {
throw new RuntimeException(e);
}
});
return cachePropertiesMap;
}
But this not work...
Anyone can help me? How to convert Properties to PropertySources?
Resolved...
I add factory.setTargetName(beanName); before factory.setPropertySources(propertySources);, and it works well.
I have to send a message to 2 different queues(queue1 and queue2). However, i want to rollback, if the send is failed for any of the queue(queue1 or queue2).
my source code looks as follows. can anyone through some inputs on this.
public void sendMessage(final Map<String, String> mapMessage) {
jmsTemplate.send(queue1, session -> {
MapMessage message = session.createMapMessage();
Iterator<Entry<String, String>> it = mapMessage.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<String, String> pair = it.next();
message.setStringProperty(pair.getKey(), pair.getValue());
}
message.setJMSRedelivered(true);
message.setJMSCorrelationID(UUID.randomUUID().toString().replaceAll("-", ""));
return message;
});
jmsTemplate.send(queue2, session -> {
MapMessage message = session.createMapMessage();
Iterator<Entry<String, String>> it = mapMessage.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<String, String> pair = it.next();
message.setStringProperty(pair.getKey(), pair.getValue());
}
message.setJMSRedelivered(true);
message.setJMSCorrelationID(UUID.randomUUID().toString().replaceAll("-", ""));
return message;
});
}
Start a transaction before entering the sendMessage method, e.g. with #Transactional - see the Spring Framework Reference Manual.
I am trying to Streaming the data from Kafka to Spark
JavaPairInputDStream<String, String> directKafkaStream = KafkaUtils.createDirectStream(ssc,
String.class,
String.class,
StringDecoder.class,
StringDecoder.class,
kafkaParams, topics);
Here i am iterating over the JavaPairInputDStream to process the RDD's.
directKafkaStream.foreachRDD(rdd ->{
rdd.foreachPartition(items ->{
while (items.hasNext()) {
String[] State = items.next()._2.split("\\,");
System.out.println(State[2]+","+State[3]+","+State[4]+"--");
};
});
});
I can able to fetch the data in foreachRDD and my requirement is have to access State Array globally. When i am trying to access the State Array globally i am getting Exception
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
Any suggestions ? Thanks.
This is more of a joining your lookup table with streaming RDD to get all the items that have a matching 'code' and 'violationCode' fields.
The flow should be like this.
Create an RDD of Hive lookup table => lookupRdd
Create DStream from kafka stream
For each RDD in Dstream, join lookupRDD with streamRdd, process the joined items(calculate sum of amount...) and save this processed result.
Note Below code is incomplete. Please complete all the TODO comments.
JavaPairDStream<String, String> streamPair = directKafkaStream.mapToPair(new PairFunction<Tuple2<String, String>, String, String>() {
#Override
public Tuple2<String, String> call(Tuple2<String, String> tuple2) throws Exception {
System.out.println("Tuple2 Message is----------" + tuple2._2());
String[] state = tuple2._2.split("\\,");
return new Tuple2<>(state[4], tuple2._2()); //pair <ViolationCode, data>
}
});
streamPair.foreachRDD(new Function<JavaPairRDD<String, String>, Void>() {
JavaPairRDD<String, String> hivePairRdd = null;
#Override
public Void call(JavaPairRDD<String, String> stringStringJavaPairRDD) throws Exception {
if (hivePairRdd == null) {
hivePairRdd = initHiveRdd();
}
JavaPairRDD<String, Tuple2<String, String>> joinedRdd = stringStringJavaPairRDD.join(hivePairRdd);
System.out.println(joinedRdd.take(10));
//todo process joinedRdd here and save the results.
joinedRdd.count(); //to trigger an action
return null;
}
});
}
public static JavaPairRDD<String, String> initHiveRdd() {
JavaRDD<String> hiveTableRDD = null; //todo code to create RDD from hive table
JavaPairRDD<String, String> hivePairRdd = hiveTableRDD.mapToPair(new PairFunction<String, String, String>() {
#Override
public Tuple2<String, String> call(String row) throws Exception {
String code = null; //TODO process 'row' and get 'code' field
return new Tuple2<>(code, row);
}
});
return hivePairRdd;
}