spring-integration-aws dynamic file download

I have a requirement to download a file from S3 based on message content. In other words, the file to download is not known in advance; I have to search for and find it at runtime. S3StreamingMessageSource doesn't seem to be a good fit because:
It relies on polling, whereas I need to wait for the message.
I can't find any way to create an S3StreamingMessageSource dynamically in the middle of a flow. gateway(IntegrationFlow) looks interesting, but what I need is a gateway(Function<Message<?>, IntegrationFlow>), which doesn't exist.
Another candidate is S3MessageHandler, but it has no support for listing files, which I need in order to find the desired file.
I can implement my own message handler using the AWS API directly; I'm just wondering if I'm missing something, because this doesn't seem like an unusual requirement. After all, not every app just sits there and keeps polling S3 for new files.

There is S3RemoteFileTemplate with a list() function which you can use in a handle(). Then split() the result and call S3MessageHandler for each remote file to download. (The latter also has functionality to download a whole remote directory.)
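A minimal sketch of that suggestion, assuming a "my-bucket" bucket and treating the incoming payload as a key prefix (both are placeholders, and the DOWNLOAD command of S3MessageHandler expects the local target directory as the message payload, so a real flow needs a transform before the final handle()):
@Bean
public IntegrationFlow searchAndDownloadFlow(AmazonS3 amazonS3) {
    S3RemoteFileTemplate template = new S3RemoteFileTemplate(new S3SessionFactory(amazonS3));
    S3MessageHandler download = new S3MessageHandler(amazonS3, "my-bucket");
    download.setCommand(S3MessageHandler.Command.DOWNLOAD);
    return f -> f
            // list remote objects under a prefix derived from the message
            .handle((payload, headers) -> template.list("my-bucket/" + payload))
            // one message per S3ObjectSummary
            .split()
            // download each listed object
            .handle(download);
}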

For anyone coming across this question, this is what I did. The trick is to:
Set filters later, not at construction time. Note that there is no addFilters or getFilters method, so filters can only be set once and can't be added later. @artem-bilan, this is inconvenient.
Call S3StreamingMessageSource.receive manually.
.handle(String.class, (fileName, h) -> {
            if (messageSource instanceof S3StreamingMessageSource) {
                S3StreamingMessageSource s3StreamingMessageSource = (S3StreamingMessageSource) messageSource;
                ChainFileListFilter<S3ObjectSummary> chainFileListFilter = new ChainFileListFilter<>();
                chainFileListFilter.addFilters(
                        new S3SimplePatternFileListFilter("**/*/*.json.gz"),
                        new S3PersistentAcceptOnceFileListFilter(metadataStore, ""),
                        new S3FileListFilter(fileName));
                s3StreamingMessageSource.setFilter(chainFileListFilter);
                return s3StreamingMessageSource.receive();
            }
            log.warn("Expected: {} but got: {}.",
                    S3StreamingMessageSource.class.getName(), messageSource.getClass().getName());
            return messageSource.receive();
        }, spec -> spec
                .requiresReply(false) // in case all messages got filtered out
)

Related

Spring Integration SFTP - issue with filters and the number of messages emitted

I started using Spring Integration SFTP and I have some questions.
Filters are not working. I have this example configuration:
Sftp.inboundAdapter(ftpFileSessionFactory())
        .preserveTimestamp(true)
        .deleteRemoteFiles(false)
        .remoteDirectory(integrationProperties.getRemoteDirectory())
        .filter(sftpFileListFilter()) // doesn't work
        .patternFilter("*.xlsx") // doesn't work
And my ChainFileListFilter:
private ChainFileListFilter<ChannelSftp.LsEntry> sftpFileListFilter() {
    ChainFileListFilter<ChannelSftp.LsEntry> chainFileListFilter = new ChainFileListFilter<>();
    chainFileListFilter.addFilter(new SftpPersistentAcceptOnceFileListFilter(metadataStore(), "INT"));
    chainFileListFilter.addFilter(new SftpSimplePatternFileListFilter("*.xlsx"));
    return chainFileListFilter;
}
If I understand correctly, only XLSX files should be saved to the local directory. If so, it doesn't work with this configuration. Am I doing something wrong, or have I misunderstood?
How can I configure SFTP so that each downloaded file emits a message? I see two parameters in the docs, max-messages-per-poll and max-fetch-size, but I don't know how to set them so that every file emits a message. I would like to sync files once every 24 hours and produce a batch job queue. Maybe there is a workaround?
Is there a built-in filter which would allow me to fetch only files with changed content? The best solution would be to check the files' checksums.
I will be grateful for your help and explanations.
You cannot combine filter() and patternFilter(). Only one of them can be used: the last one overrides whatever you used before. In other words: either filter() or patternFilter(), not both. By default the logic is like this:
public SftpInboundChannelAdapterSpec patternFilter(String pattern) {
    return filter(composeFilters(new SftpSimplePatternFileListFilter(pattern)));
}

private CompositeFileListFilter<ChannelSftp.LsEntry> composeFilters(
        FileListFilter<ChannelSftp.LsEntry> fileListFilter) {
    CompositeFileListFilter<ChannelSftp.LsEntry> compositeFileListFilter = new CompositeFileListFilter<>();
    compositeFileListFilter.addFilters(fileListFilter,
            new SftpPersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), "sftpMessageSource"));
    return compositeFileListFilter;
}
So, technically you don't need your custom one if you don't use an external persistent MetadataStore. But if you do, consider flipping SftpSimplePatternFileListFilter and SftpPersistentAcceptOnceFileListFilter, since it is better to check the pattern before storing the file in the MetadataStore.
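Against the configuration in the question, that reordering is just this (metadataStore() and the "INT" prefix come from the question; only the filter order changes, and only filter(sftpFileListFilter()) should be used, without patternFilter()):
private ChainFileListFilter<ChannelSftp.LsEntry> sftpFileListFilter() {
    ChainFileListFilter<ChannelSftp.LsEntry> chainFileListFilter = new ChainFileListFilter<>();
    // check the file name pattern first...
    chainFileListFilter.addFilter(new SftpSimplePatternFileListFilter("*.xlsx"));
    // ...and only record files that passed it in the MetadataStore
    chainFileListFilter.addFilter(new SftpPersistentAcceptOnceFileListFilter(metadataStore(), "INT"));
    return chainFileListFilter;
}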
It is a fact that every synced remote file that passes those filters is stored in the local directory, and the message for that local file is emitted immediately when the poller makes a request.
The maxFetchSize plays its role when we load remote files into the local directory. The maxMessagesPerPoll is used by the poller, but those messages are already built from the local files. The message is emitted per local file, not as a batch for all of them; that's not what messaging is designed for.
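For illustration, a hypothetical adapter wiring along those lines (the directories and the terminal handler are placeholders):
@Bean
public IntegrationFlow sftpSyncFlow(SessionFactory<ChannelSftp.LsEntry> sessionFactory) {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sessionFactory)
                            .remoteDirectory("/remote")
                            .localDirectory(new File("local-dir"))
                            .maxFetchSize(10),                   // limits each remote fetch
                    e -> e.poller(Pollers.fixedDelay(24, TimeUnit.HOURS)
                            .maxMessagesPerPoll(-1)))            // drain all synced local files per poll
            .handle(message -> System.out.println(message.getPayload())) // one message per file
            .get();
}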
Please share more info about what does not work with the files. The SftpPersistentAcceptOnceFileListFilter checks not only the file name but also the mtime of the file. So it is not about any checksum, but rather the last-modified timestamp of the file.

How to retrieve the XPC service of a file provider extension on macOS?

I have extended my example project from my previous question with an attempt to establish an XPC connection.
In a different project we have successfully implemented the file provider for iOS. The exposed service must be resolved by the URLs it is responsible for. On iOS this is the only possibility, and on macOS it appears to be the same. Because on macOS the system takes care of managing files, there are no URLs except the one which can be resolved through NSFileProviderItemIdentifier.rootContainer.
In the AppDelegate.didFinishLaunching() method I try to retrieve the service like this (see the linked code for full reference; I do not want to unnecessarily bloat this question page for now):
let fileManager = FileManager.default
let fileProviderManager = NSFileProviderManager(for: domain)!
fileProviderManager.getUserVisibleURL(for: NSFileProviderItemIdentifier.rootContainer) { url, error in
    // [...]
    fileManager.getFileProviderServicesForItem(at: url) { list, error in
        // list always contains 0 items!
    }
}
The delivered list is always empty. However, the extension creates a service source on initialization, which creates an NSXPCListener, whose NSXPCListenerDelegate exports the NSFileProviderReplicatedExtension object on new connections. What am I missing?
func listener(_ listener: NSXPCListener, shouldAcceptNewConnection newConnection: NSXPCConnection) -> Bool {
    os_log("XPC listener delegate should accept new connection...")
    newConnection.exportedObject = fileProviderExtension
    newConnection.exportedInterface = NSXPCInterface(with: SomeProviderServiceInterface.self)
    newConnection.remoteObjectInterface = NSXPCInterface(with: SomeProductServiceInterface.self)
    newConnection.resume()
    return true
}
Suspicious: the serviceName of the FileProviderServiceSource is never queried. We are out of ideas as to why this is not working.
There is a protocol which your extension's principal class can implement, NSFileProviderServicing.
https://developer.apple.com/documentation/fileprovider/nsfileproviderservicing
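A minimal sketch of adopting it, assuming the extension's principal class is called FileProviderExtension and already owns the service source from the question (both names are placeholders):
extension FileProviderExtension: NSFileProviderServicing {
    func supportedServiceSources(
        for itemIdentifier: NSFileProviderItemIdentifier,
        completionHandler: @escaping ([NSFileProviderServiceSource]?, Error?) -> Void
    ) -> Progress {
        // Hand the system the service source instead of only creating an
        // NSXPCListener on initialization; the system then queries
        // serviceName and establishes the connection itself.
        completionHandler([fileProviderServiceSource], nil)
        let progress = Progress(totalUnitCount: 1)
        progress.completedUnitCount = 1
        return progress
    }
}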

How to rename all symbols using Roslyn?

I'm building a standalone app which loads a folder of C# code and allows the user to write regexes to select and rename namespace/type/field/property/method/argument/variable/event names, but I'm stuck at renaming the source code.
I have analyzed the SyntaxTree and collected all the items, and also searched/matched/renamed them with the regexes.
I have written plenty of code trying to get Roslyn to rename the "items", but only the first "item" is renamed while all the following ones are discarded.
I am aware of the immutability of the Syntax API; after calling Renamer I save the solution, and I also re-search for the document in the new solution in the next loop iteration.
// renaming code
var newSolution = await Renamer.RenameSymbolAsync(solution, isymbol, newName, solution.Workspace.Options).ConfigureAwait(false);
this.solution = newSolution;

// re-search code
if (solution.Projects.First().ContainsDocument(doc.Document.Id)) {
    var document = project.GetDocument(doc.Document.Id);
    ...
}
At the end I call SyntaxTree.GetRoot().ToString(); to get the final edited code, which, as mentioned above, contains only the first edit.
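For reference, the loop shape being described is roughly this (a sketch only; renameTargets and FindSymbol are hypothetical stand-ins for the regex matching, and the key point is that the symbol must be re-resolved from the current solution on every iteration, since ISymbol instances from an earlier solution snapshot no longer match the updated trees):
foreach (var (documentId, oldName, newName) in renameTargets)
{
    var document = solution.GetDocument(documentId);
    if (document == null) continue;

    var model = await document.GetSemanticModelAsync().ConfigureAwait(false);
    // re-locate the ISymbol in the CURRENT solution
    var symbol = FindSymbol(model, oldName); // hypothetical helper

    solution = await Renamer.RenameSymbolAsync(
        solution, symbol, newName, solution.Workspace.Options).ConfigureAwait(false);
}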
Could anyone explain to me the correct way to do this, or provide a sample of how it could be implemented, so I can try it on my own?

ExecuteScript: output two different flowfiles in NiFi

I'm using ExecuteScript with Python, and I have a dataset which may contain some corrupted data. My idea is to process the good data and put it in my flowfile content for the success relationship, and redirect the corrupted data to the failure relationship. I have done something like this:
for msg in messages:
    try:
        id = msg['id']
        timestamp = msg['time']
        value_encoded = msg['data']
        hexFrameType = '0x' + value_encoded[0:2]
        matches = re.match(regex, value_encoded)
        ....
    except:
        error_catched.append(msg)
        pass
Any idea how I can do that?
For the purposes of this answer, I am assuming you have an incoming flow file called "flowFile" which you obtained from session.get(). If you simply want to inspect the contents of flowFile and then route it to success or failure based on whether an error occurred, then in your success path you can use:
session.transfer(flowFile, REL_SUCCESS)
And in your error path you can do:
session.transfer(flowFile, REL_FAILURE)
If instead you want new files (perhaps one containing a single "msg" in your loop above) you can use:
outputFlowFile = session.create(flowFile)
to create a new flow file using the input flow file as a parent. If you want to write to the new flow file, you can use the PyStreamCallback technique described in my blog post.
If you create a new flow file, be sure to transfer the latest version of it to REL_SUCCESS or REL_FAILURE using the session.transfer() calls described above (but with outputFlowFile rather than flowFile). Also you'll need to remove your incoming flow file (since you have created child flow files from it and transferred those instead). For this you can use:
session.remove(flowFile)
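Putting those pieces together, a rough sketch of the two-output version (the parsing and the good/corrupted split are placeholders, and WriteContentCallback is the usual ExecuteScript idiom of implementing StreamCallback to write flow file content):
import json
from org.apache.nifi.processor.io import StreamCallback

class WriteContentCallback(StreamCallback):
    def __init__(self, content):
        self.content = content
    def process(self, inputStream, outputStream):
        outputStream.write(bytearray(self.content.encode('utf-8')))

flowFile = session.get()
if flowFile is not None:
    good, bad = [], []  # placeholder: split the messages parsed from flowFile

    goodFlowFile = session.create(flowFile)
    goodFlowFile = session.write(goodFlowFile, WriteContentCallback(json.dumps(good)))
    session.transfer(goodFlowFile, REL_SUCCESS)

    badFlowFile = session.create(flowFile)
    badFlowFile = session.write(badFlowFile, WriteContentCallback(json.dumps(bad)))
    session.transfer(badFlowFile, REL_FAILURE)

    # the children were transferred instead, so drop the parent
    session.remove(flowFile)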

Getting the filename/path from MvvmCross Plugins.DownloadCache

I'm currently using MvvmCross DownloadCache -- and it's working alright -- it's especially nice when I just need to drop in an image URL and it automagically downloads / caches the image and serves up a UIImage.
I was hoping to leverage the code for one other use case -- I'd like to grab source images from URLs and cache the files on the local file system, but what I really want for this other use case is the image path on the local file system instead of the UIImage itself.
What would help me most is an example of how I might accomplish that. Is it possible to make that happen in a PCL, or does it need to go into the platform-specific code?
Thanks -- that works, but just in case anyone else is following along, I wanted to document how I got Mvx.Resolve<IMvxFileDownloadCache>() to work. In my Setup.cs (in the Touch project), I had:
protected override void InitializeLastChance()
{
    Cirrious.MvvmCross.Plugins.DownloadCache.PluginLoader.Instance.EnsureLoaded();
    Cirrious.MvvmCross.Plugins.File.PluginLoader.Instance.EnsureLoaded();
    Cirrious.MvvmCross.Plugins.Json.PluginLoader.Instance.EnsureLoaded();
    ...
}
But that wasn't enough, because nothing actually registers IMvxFileDownloadCache inside the DownloadCache plugin (which I was expecting, but it's just not the case).
So then I tried adding this line here:
Mvx.LazyConstructAndRegisterSingleton<IMvxFileDownloadCache, MvxFileDownloadCache>();
But that failed because the MvxFileDownloadCache constructor takes a few arguments. So I ended up with this:
protected override void InitializeLastChance()
{
    ...
    var configuration = MvxDownloadCacheConfiguration.Default;
    var fileDownloadCache = new MvxFileDownloadCache(
        configuration.CacheName,
        configuration.CacheFolderPath,
        configuration.MaxFiles,
        configuration.MaxFileAge);
    Mvx.RegisterSingleton<IMvxFileDownloadCache>(fileDownloadCache);
    ...
}
And the resolve works okay now.
Question:
I do wonder whether two MvxFileDownloadCache objects configured in exactly the same way will cause issues by stepping on each other. I could avoid that question by changing the cache name on the one I'm constructing by hand, but I do want it to be a single cache (the assets will be the same).
If you look at the source for the plugin, you'll find https://github.com/MvvmCross/MvvmCross/blob/3.2/Plugins/Cirrious/DownloadCache/Cirrious.MvvmCross.Plugins.DownloadCache/IMvxFileDownloadCache.cs - that will give you a local file path for a cached file:
public interface IMvxFileDownloadCache
{
    void RequestLocalFilePath(string httpSource, Action<string> success, Action<Exception> error);
}
You can get hold of a service implementing this interface using Mvx.Resolve<IMvxFileDownloadCache>().
To then convert that into a system-wide file path, try NativePath in https://github.com/MvvmCross/MvvmCross/blob/3.2/Plugins/Cirrious/File/Cirrious.MvvmCross.Plugins.File/IMvxFileStore.cs#L27
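Putting the two together, a sketch (the URL is a placeholder, IMvxFileStore comes from the File plugin mentioned above, and error handling is elided):
var downloadCache = Mvx.Resolve<IMvxFileDownloadCache>();
var fileStore = Mvx.Resolve<IMvxFileStore>();

downloadCache.RequestLocalFilePath(
    "http://example.com/image.png",
    localPath =>
    {
        // convert the store-relative path into a full native file path
        var nativePath = fileStore.NativePath(localPath);
        // use nativePath on the local file system instead of a UIImage
    },
    error => { /* handle download/cache failure */ });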
