We are writing a Batch Job which takes a file as input from an FTP, generates some new files and writes them to an S3 bucket, and for this we are using Spring Integration.
The file in the FTP is an extraction from a DB and is updated each night.
The problem is that when we start the app for the first time, it connects to the FTP server without problems, downloads the file, and uploads the generated result to S3. Then we delete the downloaded file locally and wait for the next generation of the file on the FTP server to restart the process. But it never downloads the file again.
Any idea?
@Bean
public IntegrationFlow ftpInboundFlow() {
    return IntegrationFlows
            .from(ftpReader(),
                    spec -> spec.id("ftpInboundAdapter")
                            .autoStartup(true)
                            .poller(Pollers.fixedDelay(period)))
            .enrichHeaders(Map.of("CORRELATION_ID", "rcm"))
            .aggregate(aggregatorSpec -> aggregatorSpec
                    .correlationStrategy(message -> message.getHeaders().get("CORRELATION_ID"))
                    .releaseStrategy(group -> group.getMessages().size() == 2))
            .transform(stockUnmarshaller)
            .transform(stockTransformer)
            .transform(stockMarshaller)
            .transform(picturesDownloader)
            .transform(picturesZipper)
            .transform(stockIndexer)
            .handle(directoryCleaner)
            .nullChannel();
}

@Bean
public FtpInboundChannelAdapterSpec ftpReader() {
    return Ftp.inboundAdapter(ftpSessionFactory())
            .preserveTimestamp(true)
            .remoteDirectory(rootFolder)
            .autoCreateLocalDirectory(true)
            .localDirectory(new File(localDirectory));
}

@Bean
public SessionFactory<FTPFile> ftpSessionFactory() {
    DefaultFtpSessionFactory sessionFactory = new DefaultFtpSessionFactory();
    sessionFactory.setHost(host);
    sessionFactory.setUsername(userName);
    sessionFactory.setPassword(password);
    sessionFactory.setClientMode(FTPClient.PASSIVE_LOCAL_DATA_CONNECTION_MODE);
    return sessionFactory;
}
Thanks in advance.
EDIT:
I use enrichHeaders to ensure that the pipeline is triggered only when we have exactly 2 files. Maybe the headers are not removed, so the group size is always greater than 2? Maybe this is the wrong way to proceed?
Thanks again.
It sounds like you are talking about the same file. In that case, deleting it from the local directory is not enough. There are FileListFilter instances involved in the process which hold an entry for each processed file. According to your configuration you are using the in-memory variants, and they know nothing about your local file removal.
To be precise, there are two filters you need to worry about: FtpPersistentAcceptOnceFileListFilter for the remote entry and FileSystemPersistentAcceptOnceFileListFilter for the local copy of the file. Both implement ResettableFileListFilter, so you can call their remove() whenever you are done processing the file.
The FtpInboundChannelAdapterSpec in Java DSL has these options:
/**
 * Configure a {@link FileListFilter} to be applied to the remote files before
 * copying them.
 * @param filter the filter.
 * @return the spec.
 */
public S filter(FileListFilter<F> filter) {

/**
 * A {@link FileListFilter} used to determine which files will generate messages
 * after they have been synchronized.
 * @param localFileListFilter the localFileListFilter.
 * @return the spec.
 * @see AbstractInboundFileSynchronizingMessageSource#setLocalFilter(FileListFilter)
 */
public S localFilter(FileListFilter<File> localFileListFilter) {
So you can still use the filters mentioned above as the defaults, but extract them as beans and inject them into these options and into your directoryCleaner, so that it can remove the entries from those filters as well.
There is also an option like:
/**
 * Switch the local {@link FileReadingMessageSource} to use its internal
 * {@code FileReadingMessageSource.WatchServiceDirectoryScanner}.
 * @param useWatchService the {@code boolean} flag to switch to
 * {@code FileReadingMessageSource.WatchServiceDirectoryScanner} on {@code true}.
 * @since 5.0
 */
public void setUseWatchService(boolean useWatchService) {
The DELETE event is configured for the watcher as well. When it happens, the removed file is also deleted from the local filter.
You may also deal properly with a remote file when you configure:
/**
 * Set to true to enable the preservation of the remote file timestamp when transferring.
 * @param preserveTimestamp true to preserve.
 * @return the spec.
 */
public S preserveTimestamp(boolean preserveTimestamp) {
This way a newer file with the same name is treated as a different file and its entry in the filters mentioned above is overwritten. I see you use this option already, yet you report that it still doesn't work. That might be the case with an old version of Spring Integration, in which FileSystemPersistentAcceptOnceFileListFilter was not used for local files.
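The accept-once behaviour described in this answer can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `AcceptOnceModel` is a hypothetical class written for illustration, not a Spring Integration type, and it keeps its entries in a plain map instead of a MetadataStore. It shows why deleting the local file changes nothing, why a changed timestamp lets the file through again, and what calling remove() achieves.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for a persistent accept-once filter. */
class AcceptOnceModel {
    // File name -> last seen modification time (plays the role of the in-memory store).
    private final Map<String, Long> seen = new ConcurrentHashMap<>();

    /** Accepts a file the first time, or again when its timestamp differs from the stored one. */
    boolean accept(String name, long lastModified) {
        Long previous = seen.put(name, lastModified);
        return previous == null || previous != lastModified;
    }

    /** Forgets the entry, so the same name and timestamp pass the filter again. */
    boolean remove(String name) {
        return seen.remove(name) != null;
    }
}
```

In the real flow, the equivalent of `remove(...)` would be calling remove() on the two filter beans from directoryCleaner once processing has finished.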
The inbound channel adapter has two filters: .filter and .localFilter.
The first filters the remote files before downloading, the second filters files on the file system.
By default the filter is a FtpPersistentAcceptOnceFileListFilter which will only fetch new or changed files.
By default, the localFilter is a FileSystemPersistentAcceptOnceFileListFilter which, again, will only pass a file a second time if its timestamp has changed.
So the file will only be reprocessed if its timestamp changes.
I suggest you run in a debugger to see why it is not passing the filter.
I started using spring integration SFTP and I have some questions.
Filters not working. I have example configuration:
Sftp.inboundAdapter(ftpFileSessionFactory())
.preserveTimestamp(true)
.deleteRemoteFiles(false)
.remoteDirectory(integrationProperties.getRemoteDirectory())
.filter(sftpFileListFilter()) // doesn't work
.patternFilter("*.xlsx") // doesn't work
And my ChainFileListFilter:
private ChainFileListFilter<ChannelSftp.LsEntry> sftpFileListFilter() {
    ChainFileListFilter<ChannelSftp.LsEntry> chainFileListFilter = new ChainFileListFilter<>();
    chainFileListFilter.addFilter(new SftpPersistentAcceptOnceFileListFilter(metadataStore(), "INT"));
    chainFileListFilter.addFilter(new SftpSimplePatternFileListFilter("*.xlsx"));
    return chainFileListFilter;
}
If I understand correctly, only XLSX files should be saved in the local directory. If so, it doesn't work with this configuration. Am I doing something wrong, or have I misunderstood this?
How can I configure SFTP so that each downloaded file emits a message? I see two parameters in the docs, max-messages-per-poll and max-fetch-size, but I don't know how to set them so that every file emits a message. I would like to sync files once every 24 hours and produce a batch job queue. Maybe there is a workaround?
Is there a built-in filter which allows me to fetch only files with changed content? The best solution would be to check the checksums of the files.
I will be grateful for your help and explanations.
You cannot combine filter() and patternFilter(). Only one of them can be used: the last one overrides whatever you set before. In other words, use either filter() or patternFilter(), not both. By default the logic is like this:
public SftpInboundChannelAdapterSpec patternFilter(String pattern) {
    return filter(composeFilters(new SftpSimplePatternFileListFilter(pattern)));
}

private CompositeFileListFilter<ChannelSftp.LsEntry> composeFilters(
        FileListFilter<ChannelSftp.LsEntry> fileListFilter) {
    CompositeFileListFilter<ChannelSftp.LsEntry> compositeFileListFilter = new CompositeFileListFilter<>();
    compositeFileListFilter.addFilters(fileListFilter,
            new SftpPersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), "sftpMessageSource"));
    return compositeFileListFilter;
}
So, technically, you don't need your custom filter if you don't use an external persistent MetadataStore. But if you do, think about swapping the order of SftpSimplePatternFileListFilter and SftpPersistentAcceptOnceFileListFilter, since it is better to check the pattern before storing the file in the MetadataStore.
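To see why the order matters in a chain, here is a small stand-alone model. It is a sketch only: `FilterChainModel` and its helper methods are hypothetical names for illustration and are not the Spring Integration classes. It assumes a chain in which each step sees only the names that survived the previous step, and uses a plain Set in place of the MetadataStore.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical, simplified model of a chain of file filters. */
class FilterChainModel {
    /** Applies each step only to the names that passed the previous step. */
    static List<String> filter(List<String> names, List<Predicate<String>> chain) {
        List<String> remaining = new ArrayList<>(names);
        for (Predicate<String> step : chain) {
            remaining.removeIf(step.negate());
        }
        return remaining;
    }

    /** Accept-once step: records every name it is shown and accepts only new ones. */
    static Predicate<String> acceptOnce(Set<String> store) {
        return store::add;
    }

    /** Pattern step standing in for "*.xlsx". */
    static Predicate<String> xlsxOnly() {
        return name -> name.endsWith(".xlsx");
    }
}
```

With the accept-once step first, a file such as `b.txt` is recorded in the store even though the pattern step rejects it afterwards. With the pattern step first, the store only ever contains the `.xlsx` names. The filtered result is the same in both orders; only the store contents differ.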
Every synchronized remote file that passes those filters is stored in the local directory, and a message for that local file is emitted as soon as the poller makes a request.
The maxFetchSize comes into play when remote files are loaded into the local directory. The maxMessagesPerPoll is used by the poller, but those messages are built from the local files. A message is emitted per local file, not as a batch for all of them; batching is not what messaging is designed for.
Please share more information about what does not work with the files. The SftpPersistentAcceptOnceFileListFilter checks not only the file name but also the mtime of the file. So it is not about a checksum, but rather the last-modified timestamp of the file.
Code generation for Feign works fine with swagger-codegen-maven-plugin:2.2.2; unfortunately I was forced to move to openapi-generator-maven-plugin:2.2.14 or swagger-codegen-maven-plugin:2.2.14. When these generators process schemas with methods that have optional parameters, they duplicate each method with a variant taking a single map parameter annotated with @QueryMap(encoded=true).
Example:
/**
 * Note, this is equivalent to the other <code>someMethod</code> method,
 * but with the query parameters collected into a single Map parameter. This
 * is convenient for services with optional query parameters, especially when
 * used with the {@link ApiV1CodesGetQueryParams} class that allows for
 * building up this map in a fluent style.
 * @param queryParams Map of query parameters as name-value pairs
 * <p>The following elements may be specified in the query map:</p>
 * <ul>
 * <li>p1 - param1 (optional)</li>
 * <li>p2 - param2 (optional)</li>
 * </ul>
 */
@RequestLine("GET /api/v1/someMethod?p1={p1}&p2={p2}")
@Headers({
    "Accept: application/json",
})
Response someMethod(@QueryMap(encoded=true) Map<String, Object> queryParams);
The old version of the Feign library has no @QueryMap(encoded=true), so compilation of the Java code fails. I have no opportunity to upgrade the Feign library, so I want to disable this feature of the code generator, but I can't find any switch for it. Does anybody know how to switch this annoying feature off?
Instead of disabling it with a switch, you can customize the Java Feign generator's template to remove QueryMap and then generate code using the customized templates with the -t CLI option.
I've a requirement to download a file from S3 based on a message content. In other words, the file to download is previously unknown, I've to search and find it at runtime. S3StreamingMessageSource doesn't seem to be a good fit because:
It relies on polling, whereas I need to wait for the message.
I can't find any way to create a S3StreamingMessageSource dynamically in the middle of a flow. gateway(IntegrationFlow) looks interesting but what I need is a gateway(Function<Message<?>, IntegrationFlow>) that doesn't exist.
Another candidate is S3MessageHandler but it has no support for listing files which I need for finding the desired file.
I can implement my own message handler using AWS API directly, just wondering if I'm missing something, because this doesn't seem like an unusual requirement. After all, not every app just sits there and keeps polling S3 for new files.
There is S3RemoteFileTemplate with a list() function which you can use in handle(). Then split() the result and call S3MessageHandler for each remote file to download it.
That said, S3MessageHandler also has functionality to download a whole remote directory.
For anyone coming across this question, this is what I did. The trick is to:
Set filters later, not at construction time. Note that there is no addFilters or getFilters method, so filters can only be set once, and can't be added later. @artem-bilan, this is inconvenient.
Call S3StreamingMessageSource.receive manually.
.handle(String.class, (fileName, h) -> {
    if (messageSource instanceof S3StreamingMessageSource) {
        S3StreamingMessageSource s3StreamingMessageSource = (S3StreamingMessageSource) messageSource;
        ChainFileListFilter<S3ObjectSummary> chainFileListFilter = new ChainFileListFilter<>();
        chainFileListFilter.addFilters(
                new S3SimplePatternFileListFilter("**/*/*.json.gz"),
                new S3PersistentAcceptOnceFileListFilter(metadataStore, ""),
                new S3FileListFilter(fileName)
        );
        s3StreamingMessageSource.setFilter(chainFileListFilter);
        return s3StreamingMessageSource.receive();
    }
    log.warn("Expected: {} but got: {}.",
            S3StreamingMessageSource.class.getName(), messageSource.getClass().getName());
    return messageSource.receive();
}, spec -> spec
        .requiresReply(false) // in case all messages got filtered out
)
I'd like to utilize Spring Integration to initiate messages about files that appear in a remote location, without actually transferring them. All I require is the generation of a Message with, say, header values indicating the path to the file and filename.
What's the best way to accomplish this? I've tried stringing together an FTP inbound channel adapter with a service activator to write the header values I need, but this causes the file to be transferred to a local temp directory, and by the time the service activator sees it, the message consists of a java.io.File that refers to the local file and the remote path info is gone. Is it possible to transform the message before this local transfer occurs?
We had a similar problem and solved it with filters. On the inbound-channel-adapter you can set a custom filter implementation. Your filter is called before the files are transferred and receives all the information about the remote files, from which you can decide whether a file will be downloaded or not. For example:
<int-sftp:inbound-channel-adapter id="test"
session-factory="sftpSessionFactory"
channel="testChannel"
remote-directory="${sftp.remote.dir}"
local-directory="${sftp.local.dir}"
filter="customFilter"
delete-remote-files="false">
<int:poller trigger="pollingTrigger" max-messages-per-poll="${sftp.max.msg}"/>
</int-sftp:inbound-channel-adapter>
<beans:bean id="customFilter" class="your.class.location.SftpRemoteFilter"/>
The filter class is just an implementation of the FileListFilter interface. Here is a dummy filter implementation.
public class SftpRemoteFilter implements FileListFilter<LsEntry> {

    private static final Logger log = LoggerFactory.getLogger(SftpRemoteFilter.class);

    @Override
    public final List<LsEntry> filterFiles(LsEntry[] files) {
        log.info("Here is files.");
        //Do something smart
        return Collections.emptyList();
    }
}
But if you want to do it as you described, I think it is possible by setting headers on the messages and then reading those headers where you consume the payload; in that case you should use Message<File> instead of File in your service activator method.
UPDATE 2: The issue seems to be stemming from the files themselves, and not the contents. I've tried, across multiple repos, duplicating the broken files from the ground up (new file, copy-paste contents, rename, etc) and they work as expected with Doxygen.
UPDATE: It seems that all of the "broken" .h files are being saved as class_.html while working .h files are interface_.html. Strikes me as related.
Trying to set up Doxygen for my Xcode project and for some reason it is ignoring the .h file in one of my repos.
The basic structure of the project is 1 central repo with a handful of private CocoaPods being pulled in either from the local copy or the external repo, depending on which is more recent. Other pods projects, when run against Doxygen, generate documentation just fine. This one does not. I've tried it with a variety of configurations (EXTRACT_ALL, EXTRACT_STATIC, etc. etc.) to no avail.
When run on the following .h file, no documentation is generated and the only thing I see is "The documentation for this class was generated from the following file:", with the .m file following; clicking on that just shows some static string constants and the imports, still no method headers.
One thing I noticed is that if I set EXTRACT_LOCAL_METHODS to YES then it works...but that would imply that I am NOT defining methods in my .h, which is definitely untrue.
Am I missing something?
#import <Foundation/Foundation.h>
#import <FinderAuthApiProtocol.h>
#import <AuthCredentials.h>
//Keys to name persisted objects
extern NSString *const kCarrierAuthCredentialsPersistName;
extern NSString *const kFinderAuthCredentialsPersistName;
extern NSString *const kLastLoggedInUserName;
@interface CommonAuthManager : NSObject
/**
* Returns the singleton object for CommonAuthManager, or creates one if necessary
*
* @return pointer to the singleton instance
*/
+ (CommonAuthManager *)sharedInstance;
/**
* @brief Performs carrier-agnostic authentication
*
* This method is called by the UI to perform authentication with a user's
* username and password combination. The carrier-specific implementation of
* Network's FinderAuthApiProtocol determines the precise behavior but as far
* as the manager is concerned it doesn't matter; it just calls auth and waits
* for results
*
* @param userID - the user's ID (i.e. phone number, username, email, etc)
* @param password - the user's password
* @param stayLoggedIn - toggled value for refreshing auth tokens or not
* @param block - block for completion
**/
+ (void)authWithUserID:(NSString *)userID
andPassword:(NSString *)password
andStayLoggedIn:(bool)stayLoggedIn
withCompletionBlock:(void(^)(NSError *error))block;
/**
* Determines if current user is allowed to stay logged in (bypass explicit login screen)
*
* YES if all the following criteria are met:
* Current user must exist in persistence store
* Current user last login attempt must have succeeded
* Current user must be allowed to stay logged in
*
* @param error return error
*
* @return Returns YES if user is allowed to stay logged in
*/
+ (BOOL)isUserAllowedToStayLoggedIn:(NSError *__autoreleasing *)error;
/**
* Abstracted-away selector for the LLCommonAuthManager's finder API credentials
**/
+ (AuthCredentials *)finderCredentials;
/**
* Abstracted-away selector for the LLCommonAuthManager's carrier API credentials
*
* NOTE: Functionality across carriers varies. For <XXX> this object will have
* the auth token, while for <YYY> it will be the username and password
* they originally authed with.
**/
+ (AuthCredentials *)credentials;
@end
Turns out Doxygen expects a newline at the bottom of the file.
Who knew.
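To find other headers with the same problem, a small check can help. The following is a minimal sketch in Java (the language used elsewhere on this page); `TrailingNewlineCheck` and its method names are hypothetical helpers, not part of Doxygen, and the check only looks at the last byte of the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that reports whether a file's content ends with a line feed. */
class TrailingNewlineCheck {
    /** Returns true when the content is non-empty and its last byte is '\n'. */
    static boolean endsWithNewline(byte[] content) {
        return content.length > 0 && content[content.length - 1] == '\n';
    }

    /** Convenience overload that reads the file from disk first. */
    static boolean endsWithNewline(Path file) {
        try {
            return endsWithNewline(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Running `endsWithNewline(Path)` over every .h file in the pod would list the candidates that need a newline appended.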