Automatically trigger a snapshot when a file is modified in blob storage - azure-blob-storage

I use Azure Blob Storage to back up data. The backup is done by Azure Data Factory, which extracts it from Cosmos DB. The structure for storing the backed-up data is: Data Storage Account --> Container --> the JSON files I save daily. However, the backup just overwrites the files, so I can't download the previous data.
I noticed there is a Create Snapshot button: when I select the JSON files (the blobs) and click it, I get a snapshot of those files. How can I generate the snapshots automatically when the timestamp changes on those JSON files? In other words, once Data Factory has written today's data, the snapshots for today should be created.

You can create an Azure Function with a blob trigger. When the file is modified, the function is executed, and you put the snapshot logic inside the function.
Here is the sample code (note: for the first parameter, myBlob, the type is changed to CloudBlockBlob so the snapshot can be taken directly):
public static class Function1
{
    [FunctionName("Function1")]
    public static async Task Run([BlobTrigger("images/{name}", Connection = "AzureWebJobsStorage")] CloudBlockBlob myBlob, string name, ILogger log)
    {
        log.LogInformation($"C# Blob trigger function processed blob\n Name: {name}");
        log.LogInformation("Taking a snapshot of blob: " + name);
        // Create a snapshot of the blob that triggered the function
        await myBlob.SnapshotAsync();
    }
}
and here are the settings in local.settings.json:
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "connection string of the azure storage account",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet"
  }
}

Related

Passing metadata with Azure Event Grid subscription?

I've connected my blob storage account to Event Grid, via an Event Hub subscription, and can see the events from uploaded blobs.
But I was hoping to be able to pass some metadata with each received event, so I can relate the event back to a foreign key (customer identifier) without having to do extra work on each event.
Is this possible? I couldn't see anything in the API docs regarding this.
Based on the Azure Event Grid event schema for Blob storage, there are no metadata properties in the Blob storage event data.
Note that there is only one specific case of passing some metadata from the AEG subscription to its subscriber: a query string on the webhook event handler endpoint (e.g. an HttpTrigger function).
A solution for your scenario is to use an EventGridTrigger function (subscriber) with an output binding to the Event Hub.
The following example shows a lightweight implementation of the event message mediator using the EventGridTrigger function:
[FunctionName("Function1")]
[return: EventHub("%myEventHub%", Connection = "AzureEventHubConnectionString")]
public async Task<JObject> Run([EventGridTrigger]JObject ed, ILogger log)
{
// original event message
log.LogInformation(ed.ToString());
// place for event data enrichment
var metadata = new { metadata = "ABCD", abcd = 12345 };
// enrich data object
ed["data"]["url"]?.Parent.AddAfterSelf(new JProperty("subscription", JObject.FromObject(metadata)));
// show after mediation
log.LogWarning(ed.ToString());
// forward to the Event Hub
return await Task.FromResult(ed);
}
The warning-level log entry then shows the enriched message as it is forwarded to the Event Hub.
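As a side note, the %myEventHub% token and the AzureEventHubConnectionString connection name used in the output binding are resolved from application settings; locally they might be configured in local.settings.json along these lines (the values are placeholders):
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "connection string of the azure storage account",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "myEventHub": "name of the target event hub",
    "AzureEventHubConnectionString": "connection string of the event hub namespace"
  }
}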

Microsoft Bot - MemoryStorage - Error - Etag conflict

I am trying to save data to MemoryStorage in the Microsoft Bot Framework (in a .NET environment).
I am using this method to do it:
public static class StateManager
{
    private static MemoryStorage _myStorage;

    static StateManager()
    {
        _myStorage = new MemoryStorage();
    }

    public async static void Save(UserDetails userDetails)
    {
        var changes = new Dictionary<string, object>();
        {
            changes.Add("ud", userDetails);
        }
        await _myStorage.WriteAsync(changes, new CancellationToken());
    }
}
Until now it has always worked fine, but suddenly I am getting this error:
System.Exception: Etag conflict. Original: 4 Current: 5
Any idea how to solve this error? Thanks!
Edit - with solution
I figured out that the problem was that I pushed data to the memory twice in a row (without reading the data between the two pushes). That means that after I push data once, I have to read the data from the storage before I push it again.
My question now is: why can't I save data twice without reading it between the two pushes?
Without more code, I wasn't able to replicate your issue. However, it sounds like you have a concurrency problem.
Your Save() method returns void. You should instead use:
public async static Task Save(UserDetails userDetails)
Then, when saving, call with:
await StateManager.Save(userDetails).
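Putting it together, a corrected version of the class from the question might look like this (a sketch based only on the snippet above):
public static class StateManager
{
    private static readonly MemoryStorage _myStorage = new MemoryStorage();

    // Returning Task instead of void lets callers await the write,
    // so two writes cannot overlap and trigger the ETag conflict.
    public static async Task Save(UserDetails userDetails)
    {
        var changes = new Dictionary<string, object>
        {
            { "ud", userDetails }
        };
        await _myStorage.WriteAsync(changes, CancellationToken.None);
    }
}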
However, you can save yourself the trouble of these kinds of things by using BotBuilder's built-in state storage (see the sketch after the references below). References:
Save User and Conversation Data
Core Bot Sample - This is an example of good user profile storage
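To give a feel for the built-in approach, here is a minimal sketch using UserState (the class name MyBot and the property name "UserDetails" are illustrative, not from your code):
public class MyBot : ActivityHandler
{
    private readonly UserState _userState;
    private readonly IStatePropertyAccessor<UserDetails> _accessor;

    public MyBot(UserState userState)
    {
        _userState = userState;
        _accessor = userState.CreateProperty<UserDetails>("UserDetails");
    }

    protected override async Task OnMessageActivityAsync(ITurnContext<IMessageActivity> turnContext, CancellationToken cancellationToken)
    {
        // Load the stored UserDetails (or create a fresh one on first use)
        var details = await _accessor.GetAsync(turnContext, () => new UserDetails(), cancellationToken);

        // ... modify details here ...

        // Persist the changes made during this turn; because the state was read
        // at the start of the turn, the write avoids the ETag conflict above.
        await _userState.SaveChangesAsync(turnContext, false, cancellationToken);
    }
}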

Can anyone tell me how to write a Java utility to download documents to your local PC from the Content Engine in FileNet?

Hello guys, I am trying to write a Java utility to download documents to a local PC from the Content Engine in FileNet. Can anyone help me out?
You should read about the FileNet P8 CE API; the IBM Knowledge Center documentation is the place to start.
You have to know that the FileNet Content Engine has two types of interface that can be used to connect to it: RMI (EJB) and SOAP. The command-line app you are planning to write can connect only via SOAP (I am not sure this still holds for the newest versions, but what is definitely true is that it is much easier to set up a SOAP connection than an EJB one), so you have to read the part of the documentation that explains how to establish this kind of connection to your Content Engine.
In that documentation you can see that, first of all, you have to collect the required JARs for the SOAP connection: check the "Required for a Content Engine Java API CEWS transport client" section for the file names.
After you collect them, you will need the SOAP WSDL URL and a proper user and password; the user has to have the "read properties" and "read content" rights on the documents you would like to download. You also need to know the object store name and the identifier or the location of your documents.
Now we continue with the "Setting Up a Thick Client Development Environment" page (linked from the documentation above).
Here you have to scroll down to the "CEWS transport protocol (non-application-server dependent)" section.
There you can see that you have to create a jaas.conf file with the following content:
FileNetP8WSI {
    com.filenet.api.util.WSILoginModule required;
};
This file must be passed to the JVM with the following argument when you run the class we will create:
java -cp %CREATE_PROPER_CLASSPATH% -Djava.security.auth.login.config=jaas.conf DownloadClient
Now, in the top-right corner of the page, you can see links that describe what to do in order to get a connection, like "Getting Connection", "Retrieving an EntireNetwork Object", etc. I used those snippets to create the class below for you.
public class DownloadClient {
    public static void main(String[] args) throws Exception {
        String uri = "http://filenetcehost:9080/wsi/FNCEWS40MTOM";
        String userId = "ceadmin";
        String password = "password";
        String osName = "Test";
        UserContext uc = UserContext.get();
        try {
            // Get the connection and the default domain
            Connection conn = Factory.Connection.getConnection(uri);
            Domain domain = Factory.Domain.getInstance(conn, null);
            ObjectStore os = Factory.ObjectStore.fetchInstance(domain, osName, null);
            // The last value (the JAAS stanza name) must match the name of the login module in jaas.conf
            Subject subject = UserContext.createSubject(conn, userId, password, "FileNetP8WSI");
            // Set the subject on the current thread via a thread-local
            uc.pushSubject(subject);
            // From now on we are connected to FileNet CE and the object store "Test"
            // https://www.ibm.com/support/knowledgecenter/en/SSNW2F_5.2.0/com.ibm.p8.ce.dev.ce.doc/document_procedures.htm
            Document doc = Factory.Document.getInstance(os, ClassNames.DOCUMENT, new Id("{F4DD983C-B845-4255-AC7A-257202B557EC}"));
            // Because in FileNet a document can have more than one associated content element
            // (e.g. it stores single-page TIFFs and handles them as a multi-paged document),
            // we have to get the content elements and iterate over the list.
            ContentElementList docContentList = doc.get_ContentElements();
            Iterator iter = docContentList.iterator();
            while (iter.hasNext()) {
                ContentTransfer ct = (ContentTransfer) iter.next();
                // Print element sequence number and content type of the element.
                // Get and print the content of the element.
                InputStream stream = ct.accessContentStream();
                // Now you have an input stream to the document content; you can save it to a local file
                // or do whatever you want with it, just do not forget to close the stream at the end.
                stream.close();
            }
        } finally {
            uc.popSubject();
        }
    }
}
This code just shows how you can implement such a thick client; I created it now from the documentation, and it is not production code. But after specifying the packages to import, and perhaps handling the exceptions, it will probably work.
You have to specify the right URL, user, password and document ID, of course, and you have to implement the copy from the content InputStream to a FileOutputStream, e.g. by using Commons IO or Java NIO.

aws s3 delete object not working

I'm trying to upload/delete an image to/from an AWS S3 bucket using Spring Boot.
public class AmazonClient {

    private AmazonS3 s3client;

    private void initializeAmazon() {
        AWSCredentials credentials = new BasicAWSCredentials(this.accessKey, this.secretKey);
        this.s3client = AmazonS3ClientBuilder.standard()
                .withRegion(region)
                .withCredentials(new AWSStaticCredentialsProvider(credentials))
                .build();
    }

    private void uploadFileTos3bucket(String fileName, File file) {
        s3client.putObject(new PutObjectRequest(bucketName, fileName, file)
                .withCannedAcl(CannedAccessControlList.PublicRead));
    }

    public void deleteFileFromS3Bucket(String fileUrl) {
        String fileName = fileUrl.substring(fileUrl.lastIndexOf("/") + 1);
        s3client.deleteObject(new DeleteObjectRequest(bucketName + "/", fileName));
    }
}
The upload function works well; I can see the file has been uploaded to the S3 bucket. But the delete function seems to be malfunctioning: I get a success message but the file is still in the bucket.
Thanks in advance if anyone can help me figure out the problem.
From the javadoc of deleteObject (emphasis mine):
Deletes the specified object in the specified bucket. Once deleted, the object can only be restored if versioning was enabled when the object was deleted.
If attempting to delete an object that does not exist, Amazon S3 will return a success message instead of an error message.
So, most probably the path (fileName) you construct in deleteFileFromS3Bucket does not point to an S3 object.
EDIT
I'm updating my answer based on the comments:
The file name used has special characters (: in the provided example), which get URL encoded (percent-encoded). This encoded URL cannot be used to retrieve or delete the S3 object, as the percent sign in the URL would get encoded again (% gets encoded to %25).
The encoded URL has to be decoded. One way is to use java.net.URLDecoder:
URLDecoder.decode(encodedPath, "UTF-8")
public boolean deleteFileFromS3Bucket(String fileUrl) {
    String fileName = fileUrl.substring(fileUrl.lastIndexOf("/") + 1);
    try {
        DeleteObjectsRequest delObjReq = new DeleteObjectsRequest(bucketName).withKeys(fileName);
        s3client.deleteObjects(delObjReq);
        return true;
    } catch (SdkClientException s) {
        return false;
    }
}
This option works for me.
Just found out that I had added an extra slash in new DeleteObjectRequest (the bucketName + "/" in the code above); removing it fixed the delete.
The only thing that worked for me was deleting it through Cyberduck (I neither work for nor am promoting Cyberduck; I genuinely used it and it worked). Here are the steps I followed:
Download and install Cyberduck.
Click on Open Connection
Select Amazon S3 from the dropdown (default would be FTP)
Enter your access key ID and secret access key (if you don't have one, you need to create one through IAM on AWS).
You will see a list of your S3 buckets. Select the file, folder or bucket you want to delete, right-click, and delete. Even 0 KB files show up here and can be deleted.

WebJob to read from and write to the same Azure blob

I am trying to process images uploaded to Azure using a WebJob. I have 2 containers: images and thumbs.
Currently, I am reading from the images container, creating a thumbnail and writing it to the thumbs container using the following code, which works great.
public static void GenerateThumbnail([QueueTrigger("addthumb")] ImageDTO blobInfo,
    [Blob("images/{Name}", FileAccess.Read)] Stream input, [Blob("thumbs/{Name}")] CloudBlockBlob outputBlob)
{
    using (Stream output = outputBlob.OpenWrite())
    {
        ConvertImageToThumbnail(input, output, blobInfo.Name);
        outputBlob.Properties.ContentType = GetMimeType(blobInfo.Name);
    }
}
Now, I would also like to resize the main image from the images container (if it's too big), compress it and replace the original with it.
Is there a way to read from and write to the same blob?
Yes, you can read/write to the same blob. For example you could change your input binding to bind to CloudBlockBlob using FileAccess.ReadWrite:
public static void GenerateThumbnail(
    [QueueTrigger("addthumb")] ImageDTO blobInfo,
    [Blob("images/{Name}", FileAccess.ReadWrite)] CloudBlockBlob input,
    [Blob("thumbs/{Name}")] CloudBlockBlob output)
{
    // Process the image
}
You can then access the OpenRead/OpenWrite stream methods on that blob to read the image blob and process/modify it as needed.
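As a rough sketch of that idea (the resize/compress step is only a placeholder for whatever imaging library you use; the thumbnail logic from your original function is unchanged and omitted here):
public static void GenerateThumbnail(
    [QueueTrigger("addthumb")] ImageDTO blobInfo,
    [Blob("images/{Name}", FileAccess.ReadWrite)] CloudBlockBlob input,
    [Blob("thumbs/{Name}")] CloudBlockBlob output)
{
    byte[] original;

    // Read the current image from the images container
    using (var readStream = input.OpenRead())
    using (var buffer = new MemoryStream())
    {
        readStream.CopyTo(buffer);
        original = buffer.ToArray();
    }

    // Placeholder: resize/compress "original" here with your imaging library of choice
    byte[] processed = original;

    // Overwrite the same blob with the processed bytes
    using (var writeStream = input.OpenWrite())
    {
        writeStream.Write(processed, 0, processed.Length);
    }
}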
