Retrieve latest modified zip files with Spring Integration - spring-boot

I've written an application that downloads files from an SFTP server. I want to download only ZIP files, and only when they have been modified.
I've written an SftpInboundFileSynchronizer and several InboundChannelAdapters. Oddly, the same file gets downloaded over and over. I know the key is choosing the right filters, but I don't know how to do it.
public static final String SYNCHRONIZER_BEAN_NAME = "synchorinzer-bean-name";
@Bean(SYNCHRONIZER_BEAN_NAME)
public SftpInboundFileSynchronizer synchronizer(
SessionFactory<SftpClient.DirEntry> sf,
PropertiesPersistingMetadataStore ms,
AppProps cfg) {
SftpInboundFileSynchronizer sync = new SftpInboundFileSynchronizer(sf);
sync.setDeleteRemoteFiles(false);
sync.setRemoteDirectory(cfg.getFtpRemoteDirectory());
sync.setPreserveTimestamp(true);
// sync.setFilter(); ????
return sync;
}
public static final String GIPUZKOANA_OUT_CHANNEL_NAME = "GIPUZKOANA_OUT_CHANNEL";
public static final String GIPUZKOANA_SYNCHRONIZER_BEAN_NAME = "GIPUZKOANA_FILE_SYNCHRONIZER_BEAN";
@Bean(GIPUZKOANA_SYNCHRONIZER_BEAN_NAME)
@InboundChannelAdapter(channel = GIPUZKOANA_OUT_CHANNEL_NAME)
public MessageSource<File> gipuzkoanaMessageSource(
@Qualifier(SYNCHRONIZER_BEAN_NAME) SftpInboundFileSynchronizer sync,
AppProps cfg) {
SftpInboundFileSynchronizingMessageSource source = new SftpInboundFileSynchronizingMessageSource(sync);
source.setLocalDirectory(cfg.getGtfsLocalDirSyncGtfs());
source.setAutoCreateLocalDirectory(true);
source.setMaxFetchSize(1);
source.setLoggingEnabled(true);
source.setLocalFilter(files -> Lists.newArrayList(files)
.stream()
.filter(f -> f.getName().equalsIgnoreCase(cfg.getGtfsGipuzkoana()))
.collect(Collectors.toList()));
return source;
}
// ...
So far I've tried new SftpPersistentAcceptOnceFileListFilter(ms, "gtfs_") and new SftpSimplePatternFileListFilter("*.zip"), but with no luck.
How can I achieve what I want?
Thanks!

Try to use something like this:
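ChainFileListFilter<SftpClient.DirEntry> chainFileListFilter = new ChainFileListFilter<>();
// Order matters: the pattern filter runs first, then the accept-once filter.
chainFileListFilter.addFilters(new SftpSimplePatternFileListFilter("*.zip"),
        new SftpPersistentAcceptOnceFileListFilter(ms, "gtfs_"));
sync.setFilter(chainFileListFilter);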
This way it will check for file extension first and only then check for its previous state after processing.
I'm not sure about your "only when they've been modified" requirement, since this filter cannot know about such a state. You could try an SFTP variant of LastModifiedFileListFilter to make sure a file is old enough before it is pulled.
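The order of checks described above (file extension first, previously seen state second) can be illustrated without Spring. The sketch below is plain Java and is not the Spring Integration API: ModifiedZipFilter, RemoteEntry, and the in-memory map standing in for a metadata store are hypothetical names. It assumes that "modified" means the entry's last-modified timestamp differs from the one recorded on the previous pass.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical two-stage filter: extension check first, then a "seen before?" check.
final class ModifiedZipFilter {

    // Hypothetical stand-in for a remote directory entry (not a Spring type).
    static final class RemoteEntry {
        final String name;
        final long modified; // last-modified time in epoch millis

        RemoteEntry(String name, long modified) {
            this.name = name;
            this.modified = modified;
        }
    }

    // Stands in for a persistent metadata store: prefix + name -> last seen timestamp.
    private final Map<String, Long> store = new HashMap<>();
    private final String prefix;

    ModifiedZipFilter(String prefix) {
        this.prefix = prefix;
    }

    List<RemoteEntry> filter(List<RemoteEntry> entries) {
        List<RemoteEntry> accepted = new ArrayList<>();
        for (RemoteEntry e : entries) {
            // Stage 1: cheap name check, so non-ZIP files never reach the store.
            if (!e.name.toLowerCase(Locale.ROOT).endsWith(".zip")) {
                continue;
            }
            // Stage 2: accept only new files or files whose timestamp changed.
            Long previous = store.put(prefix + e.name, e.modified);
            if (previous == null || previous != e.modified) {
                accepted.add(e);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        ModifiedZipFilter filter = new ModifiedZipFilter("gtfs_");
        List<RemoteEntry> listing = List.of(
                new RemoteEntry("feed.zip", 1000L),
                new RemoteEntry("notes.txt", 1000L));
        System.out.println("first pass:  " + filter.filter(listing).size());
        System.out.println("second pass: " + filter.filter(listing).size());
    }
}
```

If the stages were swapped, every file name, ZIP or not, would end up recorded in the store, which is why the pattern check comes first.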

Related

Is there a way to batch upload a collection of InputStreams to Amazon S3 using the Java SDK?

I am aware of the TransferManager and the .uploadFileList() and .uploadFileDirectory() methods, however they accept java.io.File types as arguments. I have a collection of byte array input streams containing jpeg image data. I don't want to create in-memory files to store this data before I upload it either.
So what I need is essentially what the S3 client's PutObjectRequest does but for a collection of InputStream objects. Also, if one upload fails, I want to abort the whole thing and not upload anything, much like how a database transaction will reverse the changes if something goes wrong along the way.
Is this possible with the Java SDK?
Before I share an answer, please consider upgrading.
FYI: the TransferManager constructors are deprecated in the AWS SDK for Java; instances are now created through TransferManagerBuilder. Please consider upgrading if a TransferManagerBuilder-created instance suits your needs.
Since you asked about TransferManager, you could either 1) copy the code below and adapt its arguments to your own in-memory handling of the input stream, or 2) try the second sample further below as-is.
GitHub source, modified to take an InputStream (issue listed here):
private def uploadFile(is: InputStream, s3ObjectName: String, metadata: ObjectMetadata) = {
try {
val putObjectRequest = new PutObjectRequest(bucketName, s3ObjectName,
is, metadata)
// TransferManager supports asynchronous uploads and downloads
val upload = transferManager.upload(putObjectRequest)
upload.addProgressListener(ExceptionReporter.wrap(UploadProgressListener(putObjectRequest)))
} catch {
case e: Exception => throw new RuntimeException(e)
}
}
Bonus: a nice custom answer here that uses SequenceInputStream:
public void combineFiles() {
List<String> files = getFiles();
long totalFileSize = files.stream()
.map(this::getContentLength)
.reduce(0L, (f, s) -> f + s);
try {
try (InputStream partialFile = new SequenceInputStream(getInputStreamEnumeration(files))) {
ObjectMetadata resultFileMetadata = new ObjectMetadata();
resultFileMetadata.setContentLength(totalFileSize);
s3Client.putObject("bucketName", "resultFilePath", partialFile, resultFileMetadata);
}
} catch (IOException e) {
LOG.error("An error occurred while combining files.", e);
}
}
private Enumeration<? extends InputStream> getInputStreamEnumeration(List<String> files) {
return new Enumeration<InputStream>() {
private Iterator<String> fileNamesIterator = files.iterator();
@Override
public boolean hasMoreElements() {
return fileNamesIterator.hasNext();
}
@Override
public InputStream nextElement() {
try {
return new FileInputStream(Paths.get(fileNamesIterator.next()).toFile());
} catch (FileNotFoundException e) {
System.err.println(e.getMessage());
throw new RuntimeException(e);
}
}
};
}
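For data that is already in memory, as in the question, the same SequenceInputStream idea works without any files. The sketch below uses only the JDK and leaves the S3 call out entirely; CombinedStreams and its method names are hypothetical. Note that this concatenates the inputs into one stream (one object), as in the sample above; it does not upload each input as a separate object.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CombinedStreams {

    // Total number of bytes across all chunks; an upload call that needs a
    // content length up front would be given this value.
    static long totalLength(List<byte[]> chunks) {
        long total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        return total;
    }

    // Presents the chunks as one continuous stream without writing them to disk.
    static InputStream combine(List<byte[]> chunks) {
        List<InputStream> streams = new ArrayList<>();
        for (byte[] chunk : chunks) {
            streams.add(new ByteArrayInputStream(chunk));
        }
        return new SequenceInputStream(Collections.enumeration(streams));
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> chunks = List.of(new byte[] {1, 2}, new byte[] {3});
        try (InputStream in = combine(chunks)) {
            byte[] all = in.readAllBytes();
            System.out.println(all.length + " bytes read, " + totalLength(chunks) + " expected");
        }
    }
}
```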

How to read and write files in a reactive way using InputStream and OutputStream

I am trying to read an Excel file, manipulate it or add new data to it, and write it back out. I am also trying to make this a completely reactive process using Flux and Mono. The idea is to return the resulting file or byte array via a web service.
My question is: how do I get an InputStream and an OutputStream in a non-blocking way?
I am using the Apache POI library to read and generate the Excel file.
I currently have a solution based on a mix of Mono.fromCallable() and blocking code that gets the InputStream.
For example, the web service part is as follows:
@GetMapping(value = API_BASE_PATH + "/download", produces = "application/vnd.ms-excel")
public Mono<ByteArrayResource> download() {
Flux<TimeKeepingEntry> createExcel = excelExport.createDocument(false);
return createExcel.then(Mono.fromCallable(() -> {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
excelExport.getWb().write(outputStream);
return new ByteArrayResource(outputStream.toByteArray());
}).subscribeOn(Schedulers.elastic()));
}
And the processing of the file:
public Flux<TimeKeepingEntry> createDocument(boolean all) {
Flux<TimeKeepingEntry> entries = null;
try {
InputStream inputStream = new ClassPathResource("Timesheet Template.xlsx").getInputStream();
wb = WorkbookFactory.create(inputStream);
Sheet sheet = wb.getSheetAt(0);
log.info("Created document");
if (all) {
//all entries
} else {
entries = service.findByMonth(currentMonthName).log("Excel Export - retrievedMonths").sort(Comparator.comparing(TimeKeepingEntry::getDateOfMonth)).doOnNext(timeKeepingEntry-> {
this.populateEntry(sheet, timeKeepingEntry);
});
}
} catch (IOException e) {
log.error("Error Importing File", e);
}
return entries;
}
This works well enough, but it is not really in line with Flux and Mono. Some guidance here would be good. I would prefer to have the whole sequence non-blocking.
Unfortunately, the WorkbookFactory.create() operation is blocking, so you have to perform that operation using imperative code. However, fetching each timeKeepingEntry can be done reactively. Your code would look something like this:
public Flux<TimeKeepingEntry> createDocument() {
return Flux.generate(
this::getWorkbookSheet,
(sheet, sink) -> {
sink.next(getNextTimeKeepingEntryFrom(sheet));
},
this::closeWorkbook);
}
This will keep the workbook in memory, but will fetch each entry on demand when the elements of the Flux are requested.
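The generate pattern above has three parts: a supplier that opens the state once, a step that emits one element per request, and a cleanup that runs at the end. The sketch below mirrors that shape with a plain Iterator instead of Reactor, so it is an analogy and not the answer's code; LazyRows, the list standing in for a sheet, and the closed flag are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LazyRows implements Iterator<String>, AutoCloseable {
    private final List<String> sheet; // stands in for the opened workbook sheet
    private int index = 0;
    private boolean closed = false;

    // "State supplier": the (blocking) open happens once, up front.
    LazyRows(List<String> sheet) {
        this.sheet = sheet;
    }

    @Override
    public boolean hasNext() {
        return !closed && index < sheet.size();
    }

    // "Generator": produces exactly one element each time the consumer asks.
    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return sheet.get(index++);
    }

    // "Cleanup": releases the state when the consumer is done.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        try (LazyRows rows = new LazyRows(List.of("row-1", "row-2", "row-3"))) {
            // Only two rows are requested, so the third is never read.
            for (int i = 0; i < 2 && rows.hasNext(); i++) {
                seen.add(rows.next());
            }
        }
        System.out.println(seen);
    }
}
```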

Spring Integration - Use filename with gateway

I have a problem with Spring Integration.
I want to query an FTP server to retrieve the name of a file
(on the command line: ls "filename"),
but I cannot pass the file name dynamically.
I understand that it has something to do with the payload or a header, but I can't get it to work.
This is what I have.
In my controller, I use this:
private FtpConfig.MyGateway gateway;
...
gateway.fichierExist(filename);
In my FTP configuration class:
@Bean
public SessionFactory<FTPFile> ftpSessionFactory() {
DefaultFtpSessionFactory sf = new DefaultFtpSessionFactory();
sf.setHost("");
sf.setPort(21);
sf.setUsername("");
sf.setPassword("");
return new CachingSessionFactory<FTPFile>(sf);
}
@Bean
@ServiceActivator(inputChannel = "ftpChannelExist")
public MessageHandler handler2() {
FtpOutboundGateway ftpOutboundGateway =
new FtpOutboundGateway(ftpSessionFactory(), "ls");
ftpOutboundGateway.setOptions("-a -1");
FtpSimplePatternFileListFilter filter = new FtpSimplePatternFileListFilter("filename"); // filter on the name
return ftpOutboundGateway;
}
@MessagingGateway
public interface MyGateway {
@Gateway(requestChannel = "ftpChannelExist")
ArrayList<String> fichierExist(String filename);
}
I tried with a header too, but I could not get anything to work.
Thanks.
(Sorry for my English, I'm French.)
See LS command description in the Reference Manual:
In addition, filename filtering is provided, in the same manner as the inbound-channel-adapter.
The message payload resulting from an ls operation is a list of file names, or a list of FileInfo objects. These objects provide information such as modified time, permissions etc.
The remote directory that the ls command acted on is provided in the file_remoteDirectory header.
What is missing from your configuration is the remote directory to list files from. Typically we suggest passing that directory in the payload, as you do with your fichierExist(String filename), and configuring the third constructor argument of the FtpOutboundGateway:
FtpOutboundGateway ftpOutboundGateway =
new FtpOutboundGateway(ftpSessionFactory(), "ls", "payload");
According to the logic in the FtpOutboundGateway, that expression serves as the source of the remote directory for the LS command. In your case it is going to be the argument of your fichierExist(String filename) gateway method.
You can indeed use an FtpSimplePatternFileListFilter there, but be sure to specify a proper pattern to filter the remote files.
In the end, the names of the remote files in the requested directory, after filtering, are returned as the ArrayList<String> of your gateway. That's correct.
Otherwise your question isn't clear.
Thanks for your reply.
I have changed my FtpOutboundGateway to add "payload", but I can't use the payload in my FtpSimplePatternFileListFilter.
I've tried:
FtpSimplePatternFileListFilter filter = new FtpSimplePatternFileListFilter("filename");
FtpSimplePatternFileListFilter filter = new FtpSimplePatternFileListFilter("payload");
FtpSimplePatternFileListFilter filter = new FtpSimplePatternFileListFilter("payload.filename");
FtpSimplePatternFileListFilter filter = new FtpSimplePatternFileListFilter("payload['filename']");
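As far as I can tell, the attempts above cannot work because a simple pattern filter receives its pattern as a plain string when it is constructed; the string is matched against file names and is not evaluated as an expression. The plain-Java sketch below illustrates that behaviour with the JDK's glob matcher, which is only a stand-in for the Ant-style matching used by the Spring filter; NamePatternFilter is a hypothetical name.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class NamePatternFilter {

    // Keeps the names that match a fixed glob pattern, e.g. "*.zip" or "report.csv".
    static List<String> filter(List<String> names, String pattern) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(Paths.get(name))) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> listing = List.of("report.csv", "data.zip", "payload");
        System.out.println(filter(listing, "*.zip"));
        // The pattern is plain text: "payload" only matches a file literally named "payload".
        System.out.println(filter(listing, "payload"));
    }
}
```

To filter on a name that is only known at call time, the matching has to happen where the name is available, for example on the returned list of names.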

resource listing

I am writing a Wicket web app. I want to:
list all resources (videoPreview) in the folder,
preview them,
add a link to show each one in the main preview panel.
I have read a lot and looked at examples about resources, but I still seem to be missing something. I wrote this funny code:
RepeatingView rv = new RepeatingView("showVideo");
add(rv);
File vidPrevDir = (new File("data/catalog/"+product+"/videoPreview"));
File[] list = vidPrevDir.listFiles();
for (File file : list) {
final String previewFile = file.getName();
AjaxLink link = new AjaxLink(rv.newChildId()){
#Override
public void onClick(AjaxRequestTarget target) {
container.name="iframe";
container.attrs.clear();
container.attrs.put("class", "viewPanel");
container.attrs.put("allowfullscreen", "yes");
container.attrs.put("src", "http://www.youtube.com/embed/"+previewFile.substring(previewFile.indexOf("___"), previewFile.length()-4));
target.add(container);
}
};
rv.add(link);
link.add(new Image("videoPreview", product+"/videoPreview/"+file.getName()));
}
In the application I call
getResourceSettings().addResourceFolder("data");
It works, but I feel bad when I look at it. So my question is: how should this be done in Wicket? Maybe there is a resource listing, or a java.io.File -> wicket Image converter?
I only found a built-in method:
ServletContext context = WicketApplication.get().getServletContext();
Set productList = context.getResourcePaths("/catalog");
It lists file names, not resources, but it is a preferable approach to the one I use in the question.
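Independently of Wicket, the directory listing itself can be made a little safer. File.listFiles() returns null when the path is not a directory, so the loop in the question would fail with a NullPointerException if the folder is missing. The sketch below uses java.nio instead and returns an empty list in that case; PreviewListing and listFileNames are hypothetical names, and the Wicket components are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

final class PreviewListing {

    // Returns the sorted names of regular files in dir, or an empty list if dir is missing.
    static List<String> listFileNames(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        List<String> names = new ArrayList<>();
        // Files.list keeps a directory handle open, so close it with try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(Files::isRegularFile)
                   .forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("videoPreview");
        Files.createFile(dir.resolve("b.png"));
        Files.createFile(dir.resolve("a.png"));
        System.out.println(listFileNames(dir));
        System.out.println(listFileNames(dir.resolve("missing")));
    }
}
```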

Is there a way to create multiple instances of CacheManager in Microsoft Enterprise Library programmatically, without depending on a configuration file?

We are trying to migrate to the Microsoft Enterprise Library Caching block. However, cache manager initialization seems to be tightly tied to the config file entries, and our application creates in-memory "containers" on the fly. Is there any way to instantiate a cache manager on the fly using a pre-configured set of values (in-memory only)?
Enterprise Library 5 has a fluent configuration which makes it easy to programmatically configure the blocks. For example:
var builder = new ConfigurationSourceBuilder();
builder.ConfigureCaching()
.ForCacheManagerNamed("MyCache")
.WithOptions
.UseAsDefaultCache()
.StoreInIsolatedStorage("MyStore")
.EncryptUsing.SymmetricEncryptionProviderNamed("MySymmetric");
var configSource = new DictionaryConfigurationSource();
builder.UpdateConfigurationWithReplace(configSource);
EnterpriseLibraryContainer.Current
= EnterpriseLibraryContainer.CreateDefaultContainer(configSource);
Unfortunately, it looks like you need to configure the entire block at once so you wouldn't be able to add CacheManagers on the fly. (When I call ConfigureCaching() twice on the same builder an exception is thrown.) You can create a new ConfigurationSource but then you lose your previous configuration. Perhaps there is a way to retrieve the existing configuration, modify it (e.g. add a new CacheManager) and then replace it? I haven't been able to find a way.
Another approach is to use the Caching classes directly.
The following example uses the Caching classes to instantiate two CacheManager instances and stores them in a static Dictionary. No configuration required since it's not using the container. I'm not sure it's a great idea -- it feels a bit wrong to me. It's pretty rudimentary but hopefully helps.
public static Dictionary<string, CacheManager> caches = new Dictionary<string, CacheManager>();
static void Main(string[] args)
{
IBackingStore backingStore = new NullBackingStore();
ICachingInstrumentationProvider instrProv = new CachingInstrumentationProvider("myInstance", false, false,
new NoPrefixNameFormatter());
Cache cache = new Cache(backingStore, instrProv);
BackgroundScheduler bgScheduler = new BackgroundScheduler(new ExpirationTask(null, instrProv), new ScavengerTask(0,
int.MaxValue, new NullCacheOperation(), instrProv), instrProv);
CacheManager cacheManager = new CacheManager(cache, bgScheduler, new ExpirationPollTimer(int.MaxValue));
cacheManager.Add("test1", "value1");
caches.Add("cache1", cacheManager);
cacheManager = new CacheManager(new Cache(backingStore, instrProv), bgScheduler, new ExpirationPollTimer(int.MaxValue));
cacheManager.Add("test2", "value2");
caches.Add("cache2", cacheManager);
Console.WriteLine(caches["cache1"].GetData("test1"));
Console.WriteLine(caches["cache2"].GetData("test2"));
}
public class NullCacheOperation : ICacheOperations
{
public int Count { get { return 0; } }
public Hashtable CurrentCacheState { get { return new System.Collections.Hashtable(); } }
public void RemoveItemFromCache(string key, CacheItemRemovedReason removalReason) {}
}
If expiration and scavenging policies are the same perhaps it might be better to create one CacheManager and then use some intelligent key names to represent the different "containers". E.g. the key name could be in the format "{container name}:{item key}" (assuming that a colon will not appear in a container or key name).
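The key-naming idea can be sketched in a few lines. The example is in Java to match the other examples here and uses a plain map rather than Enterprise Library; NamespacedCache and compositeKey are hypothetical names. Only the container name has to be free of colons: the composite key is then split unambiguously at the first colon, even if the item key contains one.

```java
import java.util.HashMap;
import java.util.Map;

final class NamespacedCache {
    // One shared store; the "containers" exist only as key prefixes.
    private final Map<String, Object> entries = new HashMap<>();

    // Builds "{container name}:{item key}" and rejects container names with a colon.
    static String compositeKey(String container, String key) {
        if (container.contains(":")) {
            throw new IllegalArgumentException("container name must not contain ':'");
        }
        return container + ":" + key;
    }

    void put(String container, String key, Object value) {
        entries.put(compositeKey(container, key), value);
    }

    Object get(String container, String key) {
        return entries.get(compositeKey(container, key));
    }

    public static void main(String[] args) {
        NamespacedCache cache = new NamespacedCache();
        // The same item key in two containers does not collide.
        cache.put("cache1", "test", "value1");
        cache.put("cache2", "test", "value2");
        System.out.println(cache.get("cache1", "test"));
        System.out.println(cache.get("cache2", "test"));
    }
}
```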
You can use UnityContainer:
IUnityContainer unityContainer = new UnityContainer();
IContainerConfigurator configurator = new UnityContainerConfigurator(unityContainer);
configurator.ConfigureCache("MyCache1");
IContainerConfigurator configurator2 = new UnityContainerConfigurator(unityContainer);
configurator2.ConfigureCache("MyCache2");
// here you can access both MyCache1 and MyCache2:
var cache1 = unityContainer.Resolve<ICacheManager>("MyCache1");
var cache2 = unityContainer.Resolve<ICacheManager>("MyCache2");
And this is an extension class for IContainerConfigurator:
public static void ConfigureCache(this IContainerConfigurator configurator, string configKey)
{
ConfigurationSourceBuilder builder = new ConfigurationSourceBuilder();
DictionaryConfigurationSource configSource = new DictionaryConfigurationSource();
// simple inmemory cache configuration
builder.ConfigureCaching().ForCacheManagerNamed(configKey).WithOptions.StoreInMemory();
builder.UpdateConfigurationWithReplace(configSource);
EnterpriseLibraryContainer.ConfigureContainer(configurator, configSource);
}
With this approach you manage a static IUnityContainer object; you can then add new caches, as well as reconfigure existing cache settings, anywhere you want.

Resources