create and append to blob storage using logic apps - azure-blob-storage

I have a logic app which polls for files, does some things with them, succeeds or fails, then ends. It runs every 5 minutes and polls for a file.
If it finds a file I can create a blob with a date-time suffix, e.g. Log@{utcNow('s')}.txt.
I want to append various messages generated by the logic app to this file, e.g. whether steps succeeded or failed.
Is blob storage the best way to put a file in my Azure storage account?
Since the name of the blob depends on the date and time, how do I append to it?
It may be that the logic app does not write anything to the log file. In that case I want to delete it.
I want to create the blob at the beginning of my logic app and then update it. If there are no updates I want to delete it. The update action seems to require me to specify the name of the blob, and since I haven't created the blob yet this is impossible. One thing I also tried was initialising a string variable to the current date and time and putting that variable into the filename.

It sounds like your main problem is that after you create a blob with a dynamic name, you cannot get that blob's name for the other actions. If so, just set the blob name in the later actions using the dynamic content Path from the Create blob action; if it doesn't show up in the dynamic content picker, use the expression body('Create_blob')?['Path'].
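For illustration, here is a minimal sketch of the variable approach mentioned in the question, in workflow-definition JSON (the action and variable names are placeholders):

"Initialize_log_blob_name": {
    "type": "InitializeVariable",
    "inputs": {
        "variables": [
            {
                "name": "LogBlobName",
                "type": "string",
                "value": "Log@{utcNow('s')}.txt"
            }
        ]
    },
    "runAfter": {}
}

Later actions can then take the blob name either from @{variables('LogBlobName')} or from the Create blob output with @{body('Create_blob')?['Path']}, including the final Delete blob action that removes the log when nothing was written to it.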

Related

spring batch - how to avoid re-loading(writing) data that was loaded in the previous run

I have a basic Spring Batch app which is trying to load data from a CSV file into MySQL. The program does load the file into the DB during the first run. However, when I accidentally re-ran the job/app, it threw a primary key violation (for the right reasons).
What is the best way to avoid reloading data that is already present on the target system? When the batch job is scheduled, if for any good reason the source file has not changed since the previous run, I want to see a "0 records processed" message rather than a primary key violation error. Hope it makes sense.
More information:
Thanks. I have probably not understood the answer. Let me explain my requirement in a better way. I have a file containing data from an external data source (say, new-hire data) with a fixed name of hire.csv. The file should be updated with the delta changes for every run. As there is a possibility of a manual error of not removing all loaded rows, some new hires from the previous run could also be present in the current run. Is there a mechanism available within the ItemReader or ItemProcessor to skip records that are already present in the target DB? I could do "insert into tb where not in (select from tb)", but this runs for every row, which I don't want. Hope it is clear now. Thanks again.
However, when I accidentally re-ran the job/app, it threw a primary key violation (for the right reasons). What is the best way to avoid reloading data that is already present on the target system?
The file you are ingesting should be an (identifying) job parameter. This way, when the first run succeeds, the job instance is complete and cannot be run again. This is by design in Spring Batch, for this very use case: preventing a job from being accidentally run twice.
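A minimal sketch of what that looks like when launching the job (the job, parameter, and class names here are placeholders), assuming each run gets its own file name:

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class HireImportLauncher {

    private final JobLauncher jobLauncher;
    private final Job importHiresJob;

    public HireImportLauncher(JobLauncher jobLauncher, Job importHiresJob) {
        this.jobLauncher = jobLauncher;
        this.importHiresJob = importHiresJob;
    }

    public void launch(String inputFile) throws Exception {
        // addString() creates an identifying parameter by default, so launching
        // twice with the same file maps to the same job instance and is rejected
        // once that instance has completed successfully.
        JobParameters parameters = new JobParametersBuilder()
                .addString("input.file", inputFile)
                .toJobParameters();
        jobLauncher.run(importHiresJob, parameters);
    }
}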
Edit: Add further options based on comments
If deleting the file is an option, then you can use a job listener or a final step to delete the file after ingesting it. With this option, you need to add a second identifying parameter (since the file name is always hire.csv) to make sure you have a different job instance for each run. This option does not require a different file name for each run.
If the file can be renamed to hire-${timestamp}.csv and will be unique, then deleting the file after ingesting it and using a single job parameter with the file name is enough.
Side note: I have seen people use a business key to identify records in the input file and an item processor to query the database and filter out items that have already been ingested. This works for small datasets but performs poorly with large datasets because of the additional query for each item.
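For completeness, here is a sketch of that item-processor style filter (the Hire class and the table and column names are placeholders); returning null from process() filters the item out, at the cost of one extra query per item:

import org.springframework.batch.item.ItemProcessor;
import org.springframework.jdbc.core.JdbcTemplate;

// Placeholder domain type produced by the reader.
class Hire {
    private String employeeId;
    public String getEmployeeId() { return employeeId; }
    public void setEmployeeId(String employeeId) { this.employeeId = employeeId; }
}

public class SkipExistingHiresProcessor implements ItemProcessor<Hire, Hire> {

    private final JdbcTemplate jdbcTemplate;

    public SkipExistingHiresProcessor(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public Hire process(Hire hire) {
        Integer existing = jdbcTemplate.queryForObject(
                "select count(*) from hires where employee_id = ?",
                Integer.class, hire.getEmployeeId());
        // Returning null tells Spring Batch to skip (filter) this item.
        return (existing != null && existing > 0) ? null : hire;
    }
}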

Enumerate all files in a container and copy each file in a foreach

I need to load all .csv files in an Azure Blob container into a SQL database.
I tried using a wildcard (*.*) on the filename in the dataset, which uses the linked service that connects to the blob, and outputting itemName in the Get Metadata activity.
When executing in debug, a list of filenames is not returned in the Output window. When referencing the parameter with an expression, it states that the type is String, not a collection.
For this kind of task, I use the Get Metadata activity and process the results with a For Each activity. Inside the For Each activity, you can have a simple Copy activity to copy the CSV files to SQL tables, or if the work is more complex you can use Data Flow.
Some useful tips:
In the Get Metadata activity, under the Dataset tab > "Field list", select the "Child Items" option.
I recommend adding a Filter activity after the Get Metadata activity to ensure that you are only processing files, and optionally only the expected extensions. You configure this on its Settings tab.
In the For Each activity, on the Settings tab, set the Items based on the output of the Filter activity.
Inside the For Each, at the activity level, you reference the current instance with "@item().name" (see the expression sketch below).
This is the pattern that several of my production pipelines implement.
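To make the wiring concrete, here is a sketch of the expressions involved (the activity names Get Metadata1 and Filter1 are placeholders for whatever yours are called):

Filter activity, Items:      @activity('Get Metadata1').output.childItems
Filter activity, Condition:  @and(equals(item().type, 'File'), endswith(item().name, '.csv'))
For Each activity, Items:    @activity('Filter1').output.value
Dataset file name parameter inside the For Each (e.g. on the Copy source): @item().name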
For your needs, you could get an idea from the Lookup activity.
Lookup activity can retrieve a dataset from any of the Azure Data Factory-supported data sources. Use it in the following scenario: dynamically determine which objects to operate on in a subsequent activity, instead of hard-coding the object name. Some object examples are files and tables.
For example, with two csv files in a test container: configure a blob storage dataset, point the Lookup activity at it, and inspect its output.
Are you actually using *.*? That may be too vague for the system to interpret. Maybe you can try something a little more descriptive, like ???20190101.json, or whatever matches the pattern of your datasets.
I encountered a weird problem a couple of weeks ago when I was using the wildcard character to iterate through a bunch of files. I was always starting on row 3 and, as luck would have it, some files didn't have a row 3. A handful of files had metadata in row 1, field names in row 2, and no row 3, so I was getting some weird errors. I changed the line start to row 2 and everything worked fine after that.
Also, check out the link below.
https://azure.microsoft.com/en-us/updates/data-factory-supports-wildcard-file-filter-for-copy-activity/

Serializing query result

I have a financial system with all its business logic located in the database, and I have to code an automated workflow for batch processing of transactions, which consists of the steps listed below:
A user or an external system inserts some data into a table.
Before further processing, a snapshot of this data in the form of a CSV file with a digital signature has to be made. The CSV snapshot itself and its signature have to be saved in the same input table. The program then updates the successfully signed rows to make them available for the further steps of the code.
...further steps of code
The obvious trouble is step 2: I don't know how to assign the result of a query, as a BLOB that represents a CSV file, to a variable. It seems like basic stuff, but I couldn't find it. The CSV format was chosen by the users because it is human-readable. The signing itself can be done with a request to an external system, so that is not an issue.
Restrictions:
there is no application server which could process the data, so I have to do it with PL/SQL
there is no way to save a local file; everything must be done on the fly
I know that normally one would do all the work on the application layer or with some local files, but unfortunately this is not the case.
Any help would be highly appreciated, thanks in advance
I agree with @william-robertson: you just need to create a comma-delimited values string (assuming a header row and data rows) and write that to a CLOB. I recommend an "insert" trigger (there are lots of SQL tricks you can do to make that easier). Consumption of that CSV string will need to be owned by the part of the application that reads it in and needs to do something with it.
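As a rough PL/SQL sketch of that approach (the table and column names are placeholders, and real code would add quoting/escaping for values that may contain commas or line breaks):

DECLARE
  l_csv CLOB;

  PROCEDURE add_line(p_line IN VARCHAR2) IS
  BEGIN
    DBMS_LOB.WRITEAPPEND(l_csv, LENGTH(p_line), p_line);
  END;
BEGIN
  DBMS_LOB.CREATETEMPORARY(l_csv, TRUE);

  -- header row
  add_line('ID,AMOUNT,CREATED_AT' || CHR(10));

  -- data rows, taken straight from the query being snapshotted
  FOR r IN (SELECT id, amount, created_at FROM transactions_in) LOOP
    add_line(r.id || ',' || r.amount || ',' ||
             TO_CHAR(r.created_at, 'YYYY-MM-DD HH24:MI:SS') || CHR(10));
  END LOOP;

  -- store the snapshot next to the rows it covers
  UPDATE transaction_batches
     SET csv_snapshot = l_csv
   WHERE batch_id = 42;
END;
/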
I understand you stated you need to create a CSV, but see if you could do XML instead. Then you could use DBMS_XMLGEN to generate the necessary snapshot into a database column directly from the query for it.
I do not accept the notion that a CSV is human-readable (actually try reading one sometime as straight text). What is valid is that Excel displays it in a human-readable form. But it should also be able to display the XML in a human-readable form. Further, if needed, the data in it can be ported directly back into the original columns.
Just an alternative idea.
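A minimal sketch of the DBMS_XMLGEN alternative (again, the table and column names are placeholders):

DECLARE
  l_xml CLOB;
BEGIN
  -- DBMS_XMLGEN.GETXML runs the query and returns its result set as XML in a CLOB
  l_xml := DBMS_XMLGEN.GETXML('SELECT id, amount, created_at FROM transactions_in');

  UPDATE transaction_batches
     SET xml_snapshot = l_xml
   WHERE batch_id = 42;
END;
/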

Parse Cloud Code touch all records in database

I'm wondering whether it is possible to touch/update all records in a class so that they trigger the before-save and after-save hooks. I have a lot of records in the database and it takes time to update them all manually via the Parse control panel.
You could write a cloud job which iterates through everything, but it would need to make an actual change to each object or it won't save (because the objects won't be dirty). You're also limited on runtime so you should sort by updated date and run the job repeatedly until nothing is left to do...
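For example, a minimal sketch of such a job (assuming a reasonably recent parse-server; the class and field names are placeholders):

Parse.Cloud.job("touchAllRecords", async (request) => {
  const query = new Parse.Query("MyClass");
  query.ascending("updatedAt");            // oldest first, so repeated runs resume where they left off
  let touched = 0;
  await query.each(async (object) => {
    object.set("touchedAt", new Date());   // a real change, otherwise the object is not dirty and won't save
    await object.save(null, { useMasterKey: true });
    touched += 1;
  }, { useMasterKey: true });
  request.message("Touched " + touched + " objects");
});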

Caching JSON data using text files

I'm trying to build a caching system for my project. The idea is to save the latest information (as JSON data) for every user in a text file that lives in a dedicated folder for that user. Instead of hitting the database to fetch all the required info every time the user logs in or refreshes the page, I check a small field in the Users table called Uptodate to decide whether I should update the text file or simply use the existing one.
I'm trying to avoid using memory for caching data. Is this a good approach for the job? Should I save the data as a text file? How can I add data to the top of an existing file?
Thanks
HTML5 localStorage is perfect for what you're trying to do.
localStorage.setItem("json_data", JSONdata);
alert(localStorage.getItem("json_data"));
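A slightly fuller sketch (localStorage only stores strings, so serialize objects with JSON.stringify and parse them back when reading):

const userData = { name: "Alice", upToDate: true };            // placeholder payload
localStorage.setItem("json_data", JSON.stringify(userData));

const cached = JSON.parse(localStorage.getItem("json_data"));
console.log(cached.name);                                      // "Alice"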
