InboundChannelAdapter picks up the same file several times from S3 - Spring

I have an InboundChannelAdapter configured with an S3StreamingMessageSource.
I forced the poller to use a task executor with only one thread, yet I see the same file being picked up by the same thread three times, at 3-4 second intervals, even though the poller interval is 10 seconds. I've specified a composite filter consisting of a pattern filter and an accept-once filter, but to no effect: the file is always picked up three times.
private static final String prefix = "some_prefix";
private static final String channel = "some_channel";
private static final Pattern filePattern = Pattern.compile("^" + prefix + "some_file_name_pattern");

@Bean
@InboundChannelAdapter(value = channel,
        poller = @Poller(fixedDelay = "10000", taskExecutor = "threadPoolTaskExecutor"))
public MessageSource<InputStream> createS3InboundStreamingMessageSource() {
    S3StreamingMessageSource messageSource = new S3StreamingMessageSource(template());
    messageSource.setRemoteDirectory(bucketName);
    CompositeFileListFilter<S3ObjectSummary> compositeFileListFilter = new ChainFileListFilter<>();
    compositeFileListFilter.addFilter(new S3PersistentAcceptOnceFileListFilter(
            new SimpleMetadataStore(), prefix));
    compositeFileListFilter.addFilter(new S3RegexPatternFileListFilter(filePattern));
    messageSource.setFilter(compositeFileListFilter);
    return messageSource;
}
@Transformer(inputChannel = channel, outputChannel = "another_channel")
public Message<S3ObjectInputStream> enrich(Message<S3ObjectInputStream> s3ObjectInputStreamMessage) {
    S3ObjectInputStream s3ObjectInputStream = s3ObjectInputStreamMessage.getPayload();
    URI zipUri = s3ObjectInputStream.getHttpRequest().getURI();
    LOGGER.info("Picking up file : {}", zipUri.getPath());
    ...
}
private S3RemoteFileTemplate template() {
    S3SessionFactory sessionFactory = new S3SessionFactory(amazonS3);
    return new S3RemoteFileTemplate(sessionFactory);
}
@Bean
public TaskExecutor threadPoolTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setMaxPoolSize(1);
    executor.setThreadNamePrefix("single_thread_task_executor");
    executor.initialize();
    return executor;
}
I can see that the app reaches the @Transformer three times. I would really appreciate any help.
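One detail worth double-checking (an observation about filter ordering, not a confirmed diagnosis of the triple pickup): ChainFileListFilter stops passing a file down the chain as soon as one filter rejects it, so with the accept-once filter first, the metadata store records files even when the regex filter rejects them afterwards. The usual arrangement is the pattern filter first and the persistent accept-once filter last. Note also that SimpleMetadataStore is in-memory, so it cannot deduplicate across several running instances of the application. A sketch of the reordered chain:
ChainFileListFilter<S3ObjectSummary> filter = new ChainFileListFilter<>();
// Pattern filter first, so only matching files reach the accept-once filter...
filter.addFilter(new S3RegexPatternFileListFilter(filePattern));
// ...and the metadata store only records files that are actually emitted.
filter.addFilter(new S3PersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), prefix));
messageSource.setFilter(filter);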

Related

Create multiple beans of SftpInboundFileSynchronizingMessageSource dynamically with InboundChannelAdapter

I am using a Spring inbound channel adapter to poll files from an SFTP server. The application needs to poll multiple directories on a single SFTP server. Since the inbound channel adapter does not allow polling multiple directories, I tried creating multiple beans of the same type with different values. Since the number of directories can grow in the future, I want to control it from application properties and register the beans dynamically.
My code:
@Override
public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
    beanFactory.registerSingleton("sftpSessionFactory", sftpSessionFactory(host, port, user, password));
    beanFactory.registerSingleton("sftpInboundFileSynchronizer",
            sftpInboundFileSynchronizer((SessionFactory) beanFactory.getBean("sftpSessionFactory")));
}

public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory(String host, String port, String user, String password) {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(host);
    factory.setPort(Integer.parseInt(port));
    factory.setUser(user);
    factory.setPassword(password);
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}

private SftpInboundFileSynchronizer sftpInboundFileSynchronizer(SessionFactory sessionFactory) {
    SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sessionFactory);
    fileSynchronizer.setDeleteRemoteFiles(true);
    fileSynchronizer.setPreserveTimestamp(true);
    fileSynchronizer.setRemoteDirectory("/mydir/subdir");
    fileSynchronizer.setFilter(new SftpSimplePatternFileListFilter("*.pdf"));
    return fileSynchronizer;
}
@Bean
@InboundChannelAdapter(channel = "sftpChannel", poller = @Poller(fixedDelay = "2000"))
public MessageSource<File> sftpMessageSource(String s) {
    SftpInboundFileSynchronizingMessageSource source = new SftpInboundFileSynchronizingMessageSource(
            (AbstractInboundFileSynchronizer<ChannelSftp.LsEntry>) applicationContext.getBean("sftpInboundFileSynchronizer"));
    source.setLocalDirectory(new File("/dir/subdir"));
    source.setAutoCreateLocalDirectory(true);
    source.setLocalFilter(new AcceptOnceFileListFilter<>());
    source.setMaxFetchSize(Integer.parseInt(maxFetchSize));
    return source;
}

@Bean
@ServiceActivator(inputChannel = "sftpChannel")
public MessageHandler handler() {
    return message -> {
        LOGGER.info("Payload - {}", message.getPayload());
    };
}
This code works fine. But if I create sftpMessageSource dynamically, the @InboundChannelAdapter annotation won't work. Please suggest a way to dynamically create the sftpMessageSource and handler beans and apply the respective annotations.
Update:
The following code worked:
@PostConstruct
void init() {
    int index = 0;
    for (String directory : directories) {
        index++;
        int finalI = index;
        IntegrationFlow flow = IntegrationFlows
                .from(Sftp.inboundAdapter(sftpSessionFactory())
                                .preserveTimestamp(true)
                                .remoteDirectory(directory)
                                .autoCreateLocalDirectory(true)
                                .localDirectory(new File("/" + directory))
                                .localFilter(new AcceptOnceFileListFilter<>())
                                .maxFetchSize(10)
                                .filter(new SftpSimplePatternFileListFilter("*.pdf"))
                                .deleteRemoteFiles(true),
                        e -> e.id("sftpInboundAdapter" + finalI)
                                .autoStartup(true)
                                .poller(Pollers.fixedDelay(2000)))
                .handle(handler())
                .get();
        this.flowContext.registration(flow).register();
    }
}
@Bean
public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory() {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(host);
    factory.setPort(Integer.parseInt(port));
    factory.setUser(user);
    factory.setPassword(password);
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}
Annotations in Java are static; you can't add them to objects at runtime. Besides, the framework reads those annotations once, on application context startup. So what you are looking for is simply not possible in Java as a language.
You need to consider switching to the Java DSL in Spring Integration to be able to use its "dynamic flows": https://docs.spring.io/spring-integration/docs/5.3.1.RELEASE/reference/html/dsl.html#java-dsl-runtime-flows.
But, please, first of all, study more about what Java can and cannot do.

Spring Integration SFTP fetch daily but process immediately

I want to filter and fetch files daily, and then process all the filtered files immediately.
Here is my config:
@Bean
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
    SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sftpSessionFactory());
    fileSynchronizer.setRemoteDirectory(remoteDirectory);
    fileSynchronizer.setFilter(new SftpSimplePatternFileListFilter(downloadFilter));
    return fileSynchronizer;
}

@Bean
@InboundChannelAdapter(channel = "sftpChannel", poller = @Poller(cron = "0 0 0 * * ?"))
public MessageSource<File> sftpMessageSource() {
    SftpInboundFileSynchronizingMessageSource messageSource = new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
    messageSource.setLocalDirectory(new File(localDirectory));
    messageSource.setAutoCreateLocalDirectory(true);
    return messageSource;
}
And here is my file handler:
@ServiceActivator(inputChannel = "sftpChannel")
public void handle(File file) {
    log.info("file received . {}", file.getName());
}
It fetches the files daily, but then waits a whole day before calling my handler for each of the fetched files.
I want to consume the fetched files immediately.
Is this possible? How can I do that?
Increase maxMessagesPerPoll on the poller; it defaults to 1. A value of -1 means unlimited: the poll keeps emitting messages for as long as unprocessed files are present.
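A minimal sketch of the adapter from the question with that change applied (everything except the poller line is unchanged):
@Bean
@InboundChannelAdapter(channel = "sftpChannel",
        poller = @Poller(cron = "0 0 0 * * ?", maxMessagesPerPoll = "-1"))
public MessageSource<File> sftpMessageSource() {
    SftpInboundFileSynchronizingMessageSource messageSource =
            new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
    messageSource.setLocalDirectory(new File(localDirectory));
    messageSource.setAutoCreateLocalDirectory(true);
    return messageSource;
}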

SimpleMessageListenerContainer Amazon SQS poll interval

I am using the Spring Cloud library to poll SQS. How can I set the poll interval?
@Bean
@Primary
public AmazonSQSAsync amazonSQSAsync() {
    return AmazonSQSAsyncClientBuilder.standard()
            .withCredentials(awsCredentialsProvider())
            .withClientConfiguration(clientConfiguration())
            .build();
}

@Bean
@ConfigurationProperties(prefix = "aws.queue")
public SimpleMessageListenerContainer simpleMessageListenerContainer(AmazonSQSAsync amazonSQSAsync) {
    SimpleMessageListenerContainer simpleMessageListenerContainer = new SimpleMessageListenerContainer();
    simpleMessageListenerContainer.setAmazonSqs(amazonSQSAsync);
    simpleMessageListenerContainer.setMessageHandler(queueMessageHandler());
    simpleMessageListenerContainer.setMaxNumberOfMessages(10);
    simpleMessageListenerContainer.setTaskExecutor(threadPoolTaskExecutor());
    return simpleMessageListenerContainer;
}

@Bean
public QueueMessageHandler queueMessageHandler() {
    QueueMessageHandlerFactory queueMessageHandlerFactory = new QueueMessageHandlerFactory();
    queueMessageHandlerFactory.setAmazonSqs(amazonSQSAsync());
    QueueMessageHandler queueMessageHandler = queueMessageHandlerFactory.createQueueMessageHandler();
    return queueMessageHandler;
}

@Bean
public ThreadPoolTaskExecutor threadPoolTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(10);
    executor.setMaxPoolSize(10);
    executor.setThreadNamePrefix("oaoQueueExecutor");
    executor.initialize();
    return executor;
}
Call the setWaitTimeOut(N) method from the base class AbstractMessageListenerContainer (package org.springframework.cloud.aws.messaging.listener), where N is the long-poll timeout in seconds.
For example, if you want the poll to wait 5 seconds before it returns, add the line below in your simpleMessageListenerContainer() method. The default is 1 second if you don't call this method. The maximum long-polling timeout is 20 seconds, so the largest value you can pass is 20, which means "wait for 20 seconds":
simpleMessageListenerContainer.setWaitTimeOut(5);
The source code is here: https://github.com/spring-cloud/spring-cloud-aws/blob/master/spring-cloud-aws-messaging/src/main/java/org/springframework/cloud/aws/messaging/listener/AbstractMessageListenerContainer.java
/**
 * Configures the wait timeout that the poll request will wait for new messages to arrive if there are
 * currently no messages on the queue. Higher values will reduce poll requests to the system significantly.
 *
 * @param waitTimeOut
 *            - the wait timeout in seconds
 */
public void setWaitTimeOut(Integer waitTimeOut) {
    this.waitTimeOut = waitTimeOut;
}
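Putting it together, here is the container bean from the question with the wait timeout applied (the value 5 is only an example):
@Bean
@ConfigurationProperties(prefix = "aws.queue")
public SimpleMessageListenerContainer simpleMessageListenerContainer(AmazonSQSAsync amazonSQSAsync) {
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setAmazonSqs(amazonSQSAsync);
    container.setMessageHandler(queueMessageHandler());
    container.setMaxNumberOfMessages(10);
    container.setTaskExecutor(threadPoolTaskExecutor());
    // Long-poll for up to 5 seconds before an empty receive returns.
    container.setWaitTimeOut(5);
    return container;
}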

Spring Cloud Stream InboundChannelAdapter behaves differently with different return types

I am using Spring Cloud Stream and facing an issue: when I use InboundChannelAdapter with the return type MessageSource, it behaves like a singleton. It runs every second but sends the same data to the consumer, and the logger prints only once, at application startup.
@InboundChannelAdapter(value = Source.OUTPUT, poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
public MessageSource<String> uuidSource() {
    UuidCaller uuidCaller = new UuidCaller(atomicLong.addAndGet(1), new Date(), UUID.randomUUID().toString());
    logger.info("build request:" + uuidCaller);
    return () -> MessageBuilder.withPayload(uuidCaller.toString()).build();
}
But when I change the return type from MessageSource to a simple String, it works fine:
@InboundChannelAdapter(value = Source.OUTPUT, poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
public String uuidSource() {
    UuidCaller uuidCaller = new UuidCaller(atomicLong.addAndGet(1), new Date(), UUID.randomUUID().toString());
    logger.info("build request:" + uuidCaller);
    return uuidCaller.toString();
}
It sends updated data to the consumer and also prints the updated log every second.
So my question is: why the different behaviour for different return types?
When it's a MessageSource, it must also be annotated with @Bean; hence the UUID is only created once. When it's a POJO method, it's invoked on each poll.
If you move the UUID creation into the lambda, they will behave the same.
EDIT
@Bean
@InboundChannelAdapter(value = Source.OUTPUT, poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
public MessageSource<String> uuidSource() {
    return () -> {
        UuidCaller uuidCaller = new UuidCaller(atomicLong.addAndGet(1), new Date(), UUID.randomUUID().toString());
        logger.info("build request:" + uuidCaller);
        return MessageBuilder.withPayload(uuidCaller.toString()).build();
    };
}
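With the UuidCaller creation moved inside the lambda, a new instance is built on every receive() call, so the MessageSource variant now behaves exactly like the POJO-method variant: fresh data and a log line on each poll.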

Spring Batch chunk size

I am new to Spring Batch and I think I am stuck on something basic.
I created a job configuration like this:
//reader
@Bean
public ItemReader<UnprocessedTrek> atReader() {
    //AnalyzeTrekItemReader reader = new AnalyzeTrekItemReader();
    JdbcCursorItemReader<UnprocessedTrek> reader = new JdbcCursorItemReader<UnprocessedTrek>();
    reader.setSql("SELECT * FROM " + UnprocessedTrek.TBL_NAME);
    reader.setRowMapper(new UnprocessedTrekRowMapper());
    reader.setDataSource(rntDataSource);
    reader.setFetchSize(0);
    return reader;
}

//processor
@Bean
public ItemProcessor<UnprocessedTrek, Document> atProcessor() {
    AnalyzeTrekItemProcessor processor = new AnalyzeTrekItemProcessor();
    return processor;
}

//writer
@Bean
public ItemWriter<Document> atWriter() {
    AnalyzeTrekItemWriter writer = new AnalyzeTrekItemWriter();
    return writer;
}

@Bean
public Step analyzeTrek() {
    return steps.get("analyzeTrek")
            .<UnprocessedTrek, Document>chunk(50)
            .reader(atReader())
            .processor(atProcessor())
            .writer(atWriter())
            .build();
}
My problem is that when the number of items processed is smaller than 50, the writer is not called. What am I missing in my configuration?
Thanks for your help.
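For what it's worth (general chunk semantics, not a diagnosis of the configuration above): a chunk-oriented step also hands the writer a final, smaller chunk once the reader signals the end of data by returning null, so a batch of fewer than 50 items should still reach the writer. A minimal logging writer can confirm whether that final chunk arrives (this sketch stands in for AnalyzeTrekItemWriter and is purely illustrative):
//writer (illustrative sketch for verifying chunk delivery)
@Bean
public ItemWriter<Document> loggingWriter() {
    // Spring Batch calls this with up to 50 items per chunk, including a
    // final partial chunk when the reader returns null (end of data).
    return items -> LOGGER.info("Writer called with {} items", items.size());
}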
