Spring Batch: move files to another location - spring

I want to use Spring Batch to upload files from my server to, e.g., Google Drive.
The steps are:
1. Get the files from a specified folder.
2. Upload them to Google Drive.
3. Update each DB entry corresponding to an uploaded file (i.e. update its path).
My question is: do I necessarily have to do this with a tasklet? If so, do I have to split the job into chunks myself, and will there be no restart-on-failure support?

Related

Azure Data Factory Unzipping many files into partitions based on filename

I have a large zip file that contains 900k JSON files. I need to process these with a data flow. I'd like to organize the files into folders using the last two digits of the file name so I can process them in chunks of 10k. My question is: how do I set up a pipeline that uses part of the name of each file in the zip file (the source) as part of the path in the sink?
current setup: zipfile.zip -> /json/XXXXXX.json
desired setup: zipfile.zip -> /json/XXX/XXXXXX.json
Please check whether the references below help.
In a source transformation, you can read from a container, folder, or individual file in Azure Blob storage. Use the Source options tab to manage how the files are read. A wildcard pattern instructs the service to loop through each matching folder and file in a single source transformation, which is an effective way to process multiple files within a single flow.
[ ] matches one or more characters in the brackets.
/data/sales/**/*.csv gets all .csv files under /data/sales.
For other patterns and the full set of filtering options in Azure Blob storage, see "Copy and transform data in Azure Blob storage - Azure Data Factory & Azure Synapse" (Microsoft Docs).
For the unzip step, see the video "How to UnZip Multiple Files which are stored on Azure Blob Storage By using Azure Data Factory".
In the sink transformation, you can write to either a container or a folder in Azure Blob storage.
The File name option determines how the destination files are named in the destination folder.
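The path mapping the question asks for can be sketched outside Data Factory in a few lines. This is only an illustration of the naming rule in plain Java, not a Data Factory expression; the class and method names are hypothetical, and it assumes the file name stem ends in two digits as the question describes (the "desired setup" line shows a three-character folder, so adjust the width if that is what is really wanted).

```java
// Hypothetical helper: derives the sink folder from the last two digits of the file name.
final class PartitionPaths {
    private PartitionPaths() {}

    /** Maps e.g. ("/json", "123456.json") to "/json/56/123456.json". */
    static String partitionedPath(String root, String fileName) {
        int dot = fileName.lastIndexOf('.');
        // Strip the extension so the digits are taken from the name itself.
        String stem = dot < 0 ? fileName : fileName.substring(0, dot);
        if (stem.length() < 2) {
            throw new IllegalArgumentException("Name too short to partition: " + fileName);
        }
        String bucket = stem.substring(stem.length() - 2);
        if (!Character.isDigit(bucket.charAt(0)) || !Character.isDigit(bucket.charAt(1))) {
            throw new IllegalArgumentException("Name does not end in two digits: " + fileName);
        }
        return root + "/" + bucket + "/" + fileName;
    }
}
```

Two digits give 100 folders, so 900k files come to roughly 9,000 per folder if the trailing digits are evenly distributed, which fits the 10k target.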

Is there any way to avoid processing the same file twice with Spring Batch?

I am working on a three-step Spring Batch project. Every 10 minutes it downloads the needed text files from FTP to a local directory, processes them, and finally deletes the files from the local directory. New files are loaded onto the FTP server every 10 minutes as well. What if a problem occurs on the FTP side and no new files are loaded? Then the Spring Batch project downloads the same file and processes it again. So my question is: how can I prevent Spring Batch from processing the same file twice?
Edit: I have used the Apache Commons library to download the files from FTP.
And I am using MultiResourceItemReader to pull 2 text files on each run.
I would use the file name as a job parameter. This creates a separate job instance for each file.
Since Spring Batch prevents the same job instance from running to completion more than once, each file would be processed only once, and you would avoid processing the same file twice by design.
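The reasoning above can be modelled in plain Java. The sketch below is not Spring Batch code: it is a hypothetical in-memory stand-in for what the job repository does persistently, keyed by file name the way a job instance is keyed by its identifying parameters. A key that has completed is refused; a key whose run failed stays eligible for another attempt.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory model of "one job instance per file name".
final class ProcessedFileRegistry {
    private final Set<String> completed = new HashSet<>();

    /**
     * Runs the work only if this file name has not completed before.
     * Returns true if the work ran, false if the file was skipped.
     */
    boolean runOnce(String fileName, Runnable work) {
        if (completed.contains(fileName)) {
            return false; // already ran to completion once
        }
        work.run(); // an exception here leaves the file unmarked, so it can be retried
        completed.add(fileName);
        return true;
    }
}
```

With two files per run, as in the question, this implies launching once per file rather than once per batch of files, so that each file gets its own key.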

shared drive csv file load to Mssql table using spring

I am looking for an approach or code base that can fulfill the requirements below.
1. We have a formatted source file on a shared drive with about one million records. A new file arrives on this drive every day with a date prefix in its name (e.g. 02-12-2018_abcd.txt).
2. If any failure occurs while reading the file from the shared drive location, the SQL inserts should not be committed.
3. The job should run at a scheduled time.
I found a couple of approaches to reading the file from the shared drive: using a jar to read it; copying the file from the shared drive to the local machine (on the application server) and doing the Spring Batch processing there; or using Spring Integration adapters, inbound channels, etc.
Please suggest the best approach and a Spring code base or Git repository for it. Thanks
This is a typical use case where Spring Batch can help. You can have a first step (a simple tasklet) that copies the file from the shared drive to the local machine, and then a second step (a chunk-oriented tasklet) that reads the file and inserts the data into the database.
You can find samples here: https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples
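As a rough illustration of what the first step would do, here is a minimal sketch of the copy using only the JDK. It is not wired into Spring Batch; the class and method names are hypothetical, and in a real job this logic would sit inside the tasklet of the first step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for the "copy from shared drive to local machine" step.
final class SharedDriveCopy {
    private SharedDriveCopy() {}

    /** Copies the shared file into localDir (created if missing) and returns the local path. */
    static Path copyToLocal(Path sharedFile, Path localDir) throws IOException {
        Files.createDirectories(localDir);
        Path target = localDir.resolve(sharedFile.getFileName());
        // REPLACE_EXISTING lets a rerun overwrite a partial copy left by an earlier failure.
        return Files.copy(sharedFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

One reason to copy first is that a failure while reading from the shared drive then happens before any insert has started.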

Blob files have to be renamed manually to include parent folder path

We are new to Windows Azure and have used Windows Azure storage for blob objects while developing a Sitefinity application. However, the blob files that are uploaded to this storage when we publish to Azure from Visual Studio are uploaded with only their file names; the folder name prefix and slash are not kept. As a result, we have to rename all files manually in the Windows Azure management portal and put the folder name and a slash at the beginning of each file name so that the pages accessing these images can show them properly. Otherwise the images are not shown because the path is incorrect.
In the Sitefinity admin panel, however, when we upload these images/blob files for those pages, we upload them inside a folder, and we have configured Sitefinity to use Azure storage instead of the database.
Please check the file attached to see the screenshot.
Please help me to solve this.
A few things I would like to mention first:
Windows Azure does not support rename functionality. Rename blob functionality = copy blob followed by delete blob.
Copy blob operation is asynchronous so you must wait for copy operation to finish before deleting the blob.
Blob storage does not support folder hierarchy natively. As you may have already discovered, you create an illusion of a folder by prepending a blob name (say logo.png) with the name of folder you want (say images) and separate them with slash (/) so your blob name becomes images/logo.png.
Now coming to your problem. Needless to say that manually renaming the blobs would be a cumbersome exercise. I would recommend using a storage management tool to do that. One such example would be Azure Management Studio from Cerebrata. If you use that tool, essentially what you can do is create an empty folder in the container and then move the files into that folder. That to me would be the fastest way to achieve your objective.
If you wish to write some code to do that, here are the steps you will take:
First you will list all blobs in a blob container.
Next you will loop over this list.
For each blob (let's call it source blob), you would get its name and prepend the folder name that you want and create an instance of a CloudBlockBlob object.
Next you would initiate a copy blob operation on that blob using StartCopyFromBlob on this new blob where source is your source blob.
You would need to wait for the copy operation to finish. Once the copy operation is finished, you can safely delete the source blob.
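The steps above can be sketched as follows. This is not Azure SDK code: the container is modelled as a hypothetical in-memory map from blob name to content, purely to show the list, copy, then delete order. With real blob storage the copy is asynchronous, so the delete must only happen once the copy has been confirmed as finished.

```java
import java.util.ArrayList;
import java.util.Map;

// Hypothetical in-memory model of "rename = copy blob, then delete blob".
final class BlobPrefixer {
    private BlobPrefixer() {}

    /** Moves every blob that is not yet under folder/ to folder/<name>. */
    static void prependFolder(Map<String, byte[]> container, String folder) {
        String prefix = folder + "/";
        // List the blobs first (a snapshot), then loop, so the map can be changed safely.
        for (String name : new ArrayList<>(container.keySet())) {
            if (name.startsWith(prefix)) {
                continue; // already in the target "folder"
            }
            String target = prefix + name;
            container.put(target, container.get(name).clone()); // copy blob
            // Real storage: wait here until the copy operation reports completion.
            container.remove(name); // delete the source blob only after the copy
        }
    }
}
```

For example, a container holding logo.png ends up holding images/logo.png after `prependFolder(container, "images")`.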
P.S. I would have written some code but unfortunately I'm stuck with something else. I might write something later on (but please don't hold your breath for that :)).

WP 7 Isolated Storage

In my WP 7 app, I have to store images and XML files of two types:
1: The first type of files is not updated frequently on the server, so I want to store them permanently in local storage. Whenever the app starts it can access them from local storage, and when they are updated on the server, the local copies should be updated too. I do not want these files to be deleted when the application terminates.
2: The second type of files should be saved in isolated storage only temporarily. For example, the app requests an XML file from the server and I store it locally; the next time the app requests the same file, it gets it from local storage instead of the server. These files should be deleted when the application terminates.
How can I do this?
Thanks
1) Isolated Storage is designed to store data that should remain permanent (until the user uninstalls the app). There is example code on MSDN showing how to write and save a file. Any file you save there (temporary or not) is therefore kept until the user uninstalls the app or your app deletes the file.
2) For temporary data, you can use the PhoneApplicationService.State property. This data is discarded automatically after your app closes. However, there is a size limit (I believe PhoneApplicationService.State is limited to 4 MB).
Alternatively, if the XML file is too big, you can write it to Isolated Storage. You can then handle the application's Closing event and delete the file from Isolated Storage there using the DeleteFile method.
