Spring Integration SFTP fetch daily but process immediately - spring-boot

I want to filter and fetch files daily and then process all of the filtered files immediately.
Here is my config:
@Bean
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
    SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sftpSessionFactory());
    fileSynchronizer.setRemoteDirectory(remoteDirectory);
    fileSynchronizer.setFilter(new SftpSimplePatternFileListFilter(downloadFilter));
    return fileSynchronizer;
}

@Bean
@InboundChannelAdapter(channel = "sftpChannel", poller = @Poller(cron = "0 0 0 * * ?"))
public MessageSource<File> sftpMessageSource() {
    SftpInboundFileSynchronizingMessageSource messageSource =
            new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
    messageSource.setLocalDirectory(new File(localDirectory));
    messageSource.setAutoCreateLocalDirectory(true);
    return messageSource;
}
And here is my file handler:

@ServiceActivator(inputChannel = "sftpChannel")
public void handle(File file) {
    log.info("file received. {}", file.getName());
}
It fetches the files daily, but then waits a day between calls to my handler for each of the fetched files. I want to consume the fetched files immediately. Is that possible? How can I do it?

Increase maxMessagesPerPoll on the poller; it defaults to 1. A value of -1 means unlimited: the poll keeps emitting messages while unprocessed files are still present.
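For example, keeping the daily cron but draining every synchronized file within the same poll cycle could look like this (a sketch based on the adapter bean from the question):

@Bean
@InboundChannelAdapter(channel = "sftpChannel",
        poller = @Poller(cron = "0 0 0 * * ?", maxMessagesPerPoll = "-1"))
public MessageSource<File> sftpMessageSource() {
    SftpInboundFileSynchronizingMessageSource messageSource =
            new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
    messageSource.setLocalDirectory(new File(localDirectory));
    messageSource.setAutoCreateLocalDirectory(true);
    // With maxMessagesPerPoll = -1 the adapter keeps emitting one message per file
    // until no unprocessed local files remain, all within the single cron trigger.
    return messageSource;
}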

Related

Accessing file from SFTP without downloading it to local using Spring Integration

I currently have the following configuration:
@Bean
public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory() {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(sftpHost);
    factory.setPort(sftpPort);
    factory.setUser(sftpUser);
    if (null != sftpPrivateKey) {
        factory.setPrivateKey(sftpPrivateKey);
        factory.setPrivateKeyPassphrase(sftpPrivateKeyPassphrase);
    } else {
        factory.setPassword(sftpPassword);
    }
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}

@Bean
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
    SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sftpSessionFactory());
    // fileSynchronizer.setDeleteRemoteFiles(true);
    fileSynchronizer.setRemoteDirectory(sftpRemoteDirectory);
    fileSynchronizer.setFilter(new SftpSimplePatternFileListFilter(sftpRemoteDirectoryFilter));
    return fileSynchronizer;
}

@Bean
@InboundChannelAdapter(channel = "fromSftpChannel", poller = @Poller(cron = "0/5 * * * * *"))
public MessageSource<File> sftpMessageSource() {
    SftpInboundFileSynchronizingMessageSource source =
            new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
    source.setLocalDirectory(new File(sftpLocalDirectory));
    source.setAutoCreateLocalDirectory(true);
    source.setLocalFilter(new AcceptOnceFileListFilter<>());
    return source;
}

@Bean
@ServiceActivator(inputChannel = "fromSftpChannel")
public MessageHandler resultFileHandler() {
    return message -> System.err.println(message.getPayload());
}
This downloads everything from the remote directory to a local directory. But I have a REST controller, and I would like to stream back a byte array of the file from the SFTP server instead of downloading it to the local machine. Is that possible in Spring Integration/Boot? Do you have some code examples, please?
Since you say that you have a REST controller to make requests for SFTP files, I would recommend looking into an SftpOutboundGateway, which is designed exactly for requests and replies. See its Command.GET and Option.STREAM:

/**
 * (-stream) Streaming 'get' (returns InputStream); user must call {@link Session#close()}.
 */
STREAM("-stream"),

See more in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/sftp.html#using-the-get-command
Not sure what led you to SftpInboundFileSynchronizingMessageSource for your request-reply task...
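A minimal sketch of that approach, reusing the sftpSessionFactory() bean from the question and a hypothetical sftpGetChannel; the gateway replies with an InputStream for the remote file:

@Bean
@ServiceActivator(inputChannel = "sftpGetChannel")
public MessageHandler sftpGetGateway() {
    // "get" with Option.STREAM returns an InputStream instead of copying the file locally
    SftpOutboundGateway gateway = new SftpOutboundGateway(sftpSessionFactory(), "get", "payload");
    gateway.setOption(AbstractRemoteFileOutboundGateway.Option.STREAM);
    return gateway;
}

@MessagingGateway
public interface SftpStreamGateway {
    @Gateway(requestChannel = "sftpGetChannel")
    Message<InputStream> open(String remotePath);
}

The REST controller can then copy the payload stream to the HTTP response; the SFTP session backing the stream is carried in the IntegrationMessageHeaderAccessor.CLOSEABLE_RESOURCE header and must be closed once the stream has been consumed.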
You can use the Apache Commons Net FTPClient and call retrieveFileStream to read a file from the remote server (note that FTPClient speaks plain FTP, not SFTP, so this only applies if your server also exposes FTP). See https://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/ftp/FTPClient.html#retrieveFileStream-java.lang.String-
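A rough sketch of that idea using org.apache.commons.net.ftp.FTPClient (host, credentials, and path are hypothetical):

public byte[] fetchViaFtp() throws IOException {
    FTPClient ftp = new FTPClient();
    ftp.connect("ftp.example.com");   // hypothetical host
    ftp.login("user", "password");    // hypothetical credentials
    ftp.enterLocalPassiveMode();
    try (InputStream in = ftp.retrieveFileStream("/remote/file.pdf")) { // hypothetical path
        return in.readAllBytes();
    } finally {
        ftp.completePendingCommand(); // required after the stream is fully consumed
        ftp.logout();
        ftp.disconnect();
    }
}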
I think you can achieve this by using RemoteFileTemplate.
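A sketch of that approach, reusing the session factory bean from the question (the remote path is a hypothetical example):

SftpRemoteFileTemplate template = new SftpRemoteFileTemplate(sftpSessionFactory());
byte[] content = template.execute(session -> {
    // read the remote file into memory without ever writing it to the local disk
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    session.read("/remote/dir/file.pdf", out); // hypothetical remote path
    return out.toByteArray();
});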

Create multiple beans of SftpInboundFileSynchronizingMessageSource dynamically with InboundChannelAdapter

I am using the Spring Integration inbound channel adapter to poll files from an SFTP server. The application needs to poll multiple directories on a single SFTP server. Since the inbound channel adapter does not allow polling multiple directories, I tried creating multiple beans of the same type with different values. Since the number of directories can increase in the future, I want to control it from application properties and register the beans dynamically.
My code -
@Override
public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
    beanFactory.registerSingleton("sftpSessionFactory", sftpSessionFactory(host, port, user, password));
    beanFactory.registerSingleton("sftpInboundFileSynchronizer",
            sftpInboundFileSynchronizer((SessionFactory) beanFactory.getBean("sftpSessionFactory")));
}

public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory(String host, String port, String user, String password) {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(host);
    factory.setPort(Integer.parseInt(port));
    factory.setUser(user);
    factory.setPassword(password);
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}

private SftpInboundFileSynchronizer sftpInboundFileSynchronizer(SessionFactory sessionFactory) {
    SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sessionFactory);
    fileSynchronizer.setDeleteRemoteFiles(true);
    fileSynchronizer.setPreserveTimestamp(true);
    fileSynchronizer.setRemoteDirectory("/mydir/subdir");
    fileSynchronizer.setFilter(new SftpSimplePatternFileListFilter("*.pdf"));
    return fileSynchronizer;
}
@Bean
@InboundChannelAdapter(channel = "sftpChannel", poller = @Poller(fixedDelay = "2000"))
public MessageSource<File> sftpMessageSource(String s) {
    SftpInboundFileSynchronizingMessageSource source = new SftpInboundFileSynchronizingMessageSource(
            (AbstractInboundFileSynchronizer<ChannelSftp.LsEntry>) applicationContext.getBean("sftpInboundFileSynchronizer"));
    source.setLocalDirectory(new File("/dir/subdir"));
    source.setAutoCreateLocalDirectory(true);
    source.setLocalFilter(new AcceptOnceFileListFilter<>());
    source.setMaxFetchSize(Integer.parseInt(maxFetchSize));
    return source;
}

@Bean
@ServiceActivator(inputChannel = "sftpChannel")
public MessageHandler handler() {
    return message -> {
        LOGGER.info("Payload - {}", message.getPayload());
    };
}
This code works fine. But if I create sftpMessageSource dynamically, the @InboundChannelAdapter annotation won't work. Please suggest a way to create the sftpMessageSource and handler beans dynamically, with their respective annotations.
Update:
The following code worked:
@PostConstruct
void init() {
    int index = 0;
    for (String directory : directories) {
        index++;
        int finalI = index;
        IntegrationFlow flow = IntegrationFlows
                .from(Sftp.inboundAdapter(sftpSessionFactory())
                                .preserveTimestamp(true)
                                .remoteDirectory(directory)
                                .autoCreateLocalDirectory(true)
                                .localDirectory(new File("/" + directory))
                                .localFilter(new AcceptOnceFileListFilter<>())
                                .maxFetchSize(10)
                                .filter(new SftpSimplePatternFileListFilter("*.pdf"))
                                .deleteRemoteFiles(true),
                        e -> e.id("sftpInboundAdapter" + finalI)
                                .autoStartup(true)
                                .poller(Pollers.fixedDelay(2000)))
                .handle(handler())
                .get();
        this.flowContext.registration(flow).register();
    }
}
@Bean
public SessionFactory<ChannelSftp.LsEntry> sftpSessionFactory() {
    DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    factory.setHost(host);
    factory.setPort(Integer.parseInt(port));
    factory.setUser(user);
    factory.setPassword(password);
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}
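Here flowContext is an autowired IntegrationFlowContext; its registration(flow).register() call is what adds each dynamically built adapter/poller/handler flow to the application context at runtime, one per configured directory.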
Annotations in Java are static. You can't add them at runtime to created objects; besides, the framework reads those annotations on application context startup. So what you are looking for is simply not possible in Java as a language.
You need to consider switching to the Java DSL in Spring Integration to be able to use its "dynamic flows": https://docs.spring.io/spring-integration/docs/5.3.1.RELEASE/reference/html/dsl.html#java-dsl-runtime-flows
But please, first of all, study more about what Java can and cannot do.

InboundChannelAdapter picks up the same file several times from s3

I have an InboundChannelAdapter configured with an S3StreamingMessageSource.
I forced the Poller to use a taskExecutor with only 1 thread, but I see the same file being picked up by the same thread 3 times, at 3-4 second intervals, even though the poller interval is 10 seconds. I specified a composite filter consisting of a pattern filter and an accept-once filter, but to no avail: the file is always picked up 3 times.
private static final String prefix = "some_prefix";
private static final String channel = "some_channel";

private static final Pattern filePattern = Pattern.compile(
        "^" + prefix + "some_file_name_pattern");

@Bean
@InboundChannelAdapter(value = channel,
        poller = @Poller(fixedDelay = "10000", taskExecutor = "threadPoolTaskExecutor"))
public MessageSource<InputStream> createS3InboundStreamingMessageSource() {
    S3StreamingMessageSource messageSource = new S3StreamingMessageSource(template());
    messageSource.setRemoteDirectory(bucketName);
    CompositeFileListFilter<S3ObjectSummary> compositeFileListFilter = new ChainFileListFilter<>();
    compositeFileListFilter.addFilter(new S3PersistentAcceptOnceFileListFilter(
            new SimpleMetadataStore(), prefix));
    compositeFileListFilter.addFilter(new S3RegexPatternFileListFilter(filePattern));
    messageSource.setFilter(compositeFileListFilter);
    return messageSource;
}

@Transformer(inputChannel = channel, outputChannel = "another_channel")
public Message<S3ObjectInputStream> enrich(Message<S3ObjectInputStream> s3ObjectInputStreamMessage) {
    S3ObjectInputStream s3ObjectInputStream = s3ObjectInputStreamMessage.getPayload();
    URI zipUri = s3ObjectInputStream.getHttpRequest().getURI();
    LOGGER.info("Picking up file : {}", zipUri.getPath());
    ...
}

private S3RemoteFileTemplate template() {
    S3SessionFactory sessionFactory = new S3SessionFactory(amazonS3);
    return new S3RemoteFileTemplate(sessionFactory);
}

@Bean
public TaskExecutor threadPoolTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setMaxPoolSize(1);
    executor.setThreadNamePrefix("single_thread_task_executor");
    executor.initialize();
    return executor;
}
I see that the app reaches the @Transformer 3 times. Would really appreciate any help.

Spring Integration: how to access the returned values from last Subscriber

I'm trying to implement an SFTP upload of 2 files which has to happen in a certain order: first a PDF file and, after its successful upload, a text file with meta information about the PDF.
I followed the advice in this thread, but can't get it to work properly.
My Spring Boot Configuration:
@Bean
public SessionFactory<LsEntry> sftpSessionFactory() {
    final DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory(true);
    final Properties jschProps = new Properties();
    jschProps.put("StrictHostKeyChecking", "no");
    jschProps.put("PreferredAuthentications", "publickey,password");
    factory.setSessionConfig(jschProps);
    factory.setHost(sftpHost);
    factory.setPort(sftpPort);
    factory.setUser(sftpUser);
    if (sftpPrivateKey != null) {
        factory.setPrivateKey(sftpPrivateKey);
        factory.setPrivateKeyPassphrase(sftpPrivateKeyPassphrase);
    } else {
        factory.setPassword(sftpPassword);
    }
    factory.setAllowUnknownKeys(true);
    return new CachingSessionFactory<>(factory);
}

@Bean
@BridgeTo
public MessageChannel toSftpChannel() {
    return new PublishSubscribeChannel();
}

@Bean
@ServiceActivator(inputChannel = "toSftpChannel")
@Order(0)
public MessageHandler handler() {
    final SftpMessageHandler handler = new SftpMessageHandler(sftpSessionFactory());
    handler.setRemoteDirectoryExpression(new LiteralExpression(sftpRemoteDirectory));
    handler.setFileNameGenerator(message -> {
        if (message.getPayload() instanceof byte[]) {
            return (String) message.getHeaders().get("filename");
        } else {
            throw new IllegalArgumentException("File expected as payload.");
        }
    });
    return handler;
}

@ServiceActivator(inputChannel = "toSftpChannel")
@Order(1)
public String transferComplete(@Payload byte[] file, @Header("filename") String filename) {
    return "The SFTP transfer complete for file: " + filename;
}

@MessagingGateway
public interface UploadGateway {
    @Gateway(requestChannel = "toSftpChannel")
    String upload(@Payload byte[] file, @Header("filename") String filename);
}
My Test Case:
final String pdfStatus = uploadGateway.upload(content, documentName);
log.info("Upload of {} completed, {}.", documentName, pdfStatus);
From the return of the gateway upload call I expect to get the String confirming the upload, e.g. "The SFTP transfer complete for file: ...", but instead I get the content of the uploaded file as a byte[]:
Upload of 123456789.1.pdf completed, 37,80,68,70,45,49,46,54,13,37,-30,-29,-49,-45,13,10,50,55,53,32,48,32,111,98,106,13,60,60,47,76,105,110,101,97,114,105,122,101,100,32,49,47,76,32,50,53,52,55,49,48,47,79,32,50,55,55,47,69,32,49,49,49,55,55,55,47,78,32,49,47,84,32,50,53,52,51,53,57,47,72,32,91,32,49,49,57,55,32,53,51,55,93,62,62,13,101,110,100,111,98,106,13,32,32,32,32,32,32,32,32,32,32,32,32,13,10,52,55,49,32,48,32,111,98,106,13,60,60,47,68,101,99,111,100,101,80,97,114,109,115,60,60,47,67,111,108,117,109,110,115,32,53,47,80,114,101,100,105,99,116,111,114,32,49,50,62,62,47,70,105,108,116,101,114,47,70,108,97,116,101,68,101,99,111,100,101,47,73,68,91,60,57,66,53,49,56,54,69,70,53,66,56,66,49,50,52,49,65,56,50,49,55,50,54,56,65,65,54,52,65,57,70,54,62,60,68,52,50,68,51,55,54,53,54,65,67,48,55,54,52,65,65,53,52,66,52,57,51,50,56,52,56,68,66 etc.
What am I missing?
I think @Order(0) doesn't work together with @Bean.
To fix it, you should do this in the bean definition instead:

final SftpMessageHandler handler = new SftpMessageHandler(sftpSessionFactory());
handler.setOrder(0);

See the Reference Manual for more info:
When using these annotations on consumer @Bean definitions, if the bean definition returns an appropriate MessageHandler (depending on the annotation type), attributes such as outputChannel, requiresReply etc. must be set on the MessageHandler @Bean definition itself.
In other words: if you can use the setter, you have to. We don't process those annotations for this case because there is no guarantee which should take precedence. So, to avoid such confusion, only the setter choice is left to you.
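Applied to the handler bean from the question, that would look roughly like this (a sketch; the file-name generator is abbreviated):

@Bean
@ServiceActivator(inputChannel = "toSftpChannel")
public MessageHandler handler() {
    final SftpMessageHandler handler = new SftpMessageHandler(sftpSessionFactory());
    handler.setOrder(0); // replaces @Order(0), which is not honored on MessageHandler @Bean definitions
    handler.setRemoteDirectoryExpression(new LiteralExpression(sftpRemoteDirectory));
    handler.setFileNameGenerator(message -> (String) message.getHeaders().get("filename"));
    return handler;
}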
UPDATE
I see your problem, and it is here:

@Bean
@BridgeTo
public MessageChannel toSftpChannel() {
    return new PublishSubscribeChannel();
}

That is confirmed by the logs:

Adding {bridge:dmsSftpConfig.toSftpChannel.bridgeTo} as a subscriber to the 'toSftpChannel' channel
Channel 'org.springframework.context.support.GenericApplicationContext#b3d0f7.toSftpChannel' has 3 subscriber(s).
started dmsSftpConfig.toSftpChannel.bridgeTo

So you really have one more subscriber on that toSftpChannel: a BridgeHandler whose output goes to the replyChannel header. And since the default order is private volatile int order = Ordered.LOWEST_PRECEDENCE;, this one becomes the first subscriber, and exactly this one returns you that byte[], simply because it is the payload of the request.
You need to decide whether you really need such a bridge. There is no workaround for the @Order though...
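If the bridge is not needed, dropping @BridgeTo leaves only the two ordered subscribers; since the SftpMessageHandler produces no reply, the String returned by transferComplete() should then become the gateway's reply (a sketch of that change):

@Bean
public MessageChannel toSftpChannel() {
    // no @BridgeTo: only the SftpMessageHandler and transferComplete() subscribe now
    return new PublishSubscribeChannel();
}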

Spring batch chunk size

I am new to Spring Batch and I am stuck on something basic, I think.
I created a Job configuration like this:
// reader
@Bean
public ItemReader<UnprocessedTrek> atReader() {
    // AnalyzeTrekItemReader reader = new AnalyzeTrekItemReader();
    JdbcCursorItemReader<UnprocessedTrek> reader = new JdbcCursorItemReader<>();
    reader.setSql("SELECT * FROM " + UnprocessedTrek.TBL_NAME);
    reader.setRowMapper(new UnprocessedTrekRowMapper());
    reader.setDataSource(rntDataSource);
    reader.setFetchSize(0);
    return reader;
}

// processor
@Bean
public ItemProcessor<UnprocessedTrek, Document> atProcessor() {
    AnalyzeTrekItemProcessor processor = new AnalyzeTrekItemProcessor();
    return processor;
}

// writer
@Bean
public ItemWriter<Document> atWriter() {
    AnalyzeTrekItemWriter writer = new AnalyzeTrekItemWriter();
    return writer;
}

@Bean
public Step analyzeTrek() {
    return steps.get("analyzeTrek")
            .<UnprocessedTrek, Document> chunk(50)
            .reader(atReader())
            .processor(atProcessor())
            .writer(atWriter())
            .build();
}
My problem is that when the number of items processed is less than 50, the writer is not called. What am I missing in my configuration?
Thanks for your help.
