Send entire blob data to a webhook endpoint - azure-blob-storage

I am storing XML files in Azure Blob Storage, and the files are large. The Blob Created event payload only contains the file's properties, not the actual data.
Can you suggest a way to automatically push the entire blob data to a webhook endpoint each time a blob is created?
Is it a good idea to send the entire payload to a webhook, given that my data is very large?

Related

How to deal with the 6-megabyte limit in AWS Lambda?

I have a three-tier app running in AWS. The middle tier is written in Python (Flask) and runs on a Linux machine.
However, I have been asked to move it to AWS Lambda, which has a 6 MB limit on the data a function can return. Since I'm dealing with GeoJSON, I sometimes need to return up to 15 MB.
Although Lambda is stateless, I could return the data in partitions, but that seems problematic: I think I would have to generate the whole map again and again until all the data had been delivered.
Is there a better way to deal with this? I'm programming in Python.
I'd handle this by sending the data to S3, and issuing a redirect or JSON response that points to the URL on S3 (with a temporary, expiring URL if the data should be secure). If the data's long-lived, you can just leave it there; if not, you could use S3's lifecycle rules to have the files automatically delete after 24 hours or so.
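A minimal sketch of that approach, assuming a Lambda function behind API Gateway and boto3 (the bucket name, object key, and expiry are placeholders, not from the original answer):

import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Placeholder bucket/key; in practice derive the key from the request.
    bucket = "my-geojson-results"
    key = "results/map.geojson"

    # ... generate the GeoJSON and upload it ...
    # s3.put_object(Bucket=bucket, Key=key, Body=geojson_bytes)

    # Presigned GET URL that expires after one hour.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=3600,
    )

    # Redirect the client so it downloads the large payload straight from S3.
    return {"statusCode": 303, "headers": {"Location": url}}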
If you also control the client that receives the data, you can send a compressed result that is decompressed client-side. That way you can send the 15 MB response too, since it can become very small when compressed.
Alternatively, you can send a fragment of the whole response along with a token indicating to the client that the response is not complete. The client then makes another request with that token to get the next fragment, and so on until there are no more fragments, at which point it joins all the fragments to obtain the full response.
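A rough sketch of that fragment-and-token idea, assuming an API Gateway Lambda proxy integration (build_geojson is a hypothetical helper, and the token format is just illustrative):

import json

FRAGMENT_SIZE = 4 * 1024 * 1024  # keep each response safely under the 6 MB limit

def lambda_handler(event, context):
    # The token is simply the character offset of the next fragment; it is absent on the first call.
    params = event.get("queryStringParameters") or {}
    offset = int(params.get("token", "0"))

    geojson_text = build_geojson()  # hypothetical helper returning the full GeoJSON string
    fragment = geojson_text[offset:offset + FRAGMENT_SIZE]
    next_offset = offset + FRAGMENT_SIZE

    return {
        "statusCode": 200,
        "body": json.dumps({
            "fragment": fragment,
            # The client keeps requesting with this token until it comes back as null.
            "next_token": next_offset if next_offset < len(geojson_text) else None,
        }),
    }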
Speaking of the 6 MB limit, I hope that at some point we will be able to configure the maximum payload size, since 6 MB is fine for most cases, but not all of them.
You can use a presigned S3 URL for the upload; that way you are not bound by the payload size limit.
Send an HTTP GET request to API Gateway, have the Lambda function generate a presigned S3 URL, and return that presigned URL to the client.
The client then uploads the content directly to S3 using the presigned URL.
https://docs.aws.amazon.com/AmazonS3/latest/dev/ShareObjectPreSignedURL.html
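A minimal sketch of the Lambda side in Python with boto3 (the bucket name, default key, and expiry are placeholders):

import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # placeholder bucket name

def lambda_handler(event, context):
    # The client calls this endpoint first; the object key could come from a query parameter.
    key = (event.get("queryStringParameters") or {}).get("key", "uploads/data.json")

    # Presigned URL that lets the client PUT the object directly to S3.
    upload_url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=900,  # valid for 15 minutes
    )

    return {"statusCode": 200, "body": json.dumps({"upload_url": upload_url})}

The client can then upload directly, for example with curl -X PUT --upload-file data.json "<upload_url>", so the large payload never passes through Lambda or API Gateway.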

ParseServer: How to create a temporary file in Cloud Code and send it to the browser

I want to generate a file (PDF, TXT, CSV, ...) in a Parse Cloud Code function and send it directly to the browser as a download, without saving it on the server.
Is it possible to modify the Cloud Code response (headers, body)?
Or is there another way to achieve this?

Unable to get data from bulk API call

I am trying to load data into Fhirbase through a Bulk Data API call. I used the following command:
fhirbase --host localhost -p 5432 -d fhirbase -U postgres -W postgres --fhir=3.3.0 load -m insert http://localhost:6544/patients
The endpoint 'http://localhost:6544/patients' serves JSON data.
I am getting the response: "No Content-Location header was returned by Bulk Data API server."
Thanks for your interest in Fhirbase!
Bulk Data API is not a part of FHIR specification yet. However, there is a draft specification in the working group's GitHub repo: https://github.com/smart-on-fhir/fhir-bulk-data-docs/blob/master/export.md. This page fully describes Bulk Data API requests and responses.
The Bulk Data API works asynchronously, which means that the client does not receive the response immediately, as with regular REST endpoints. Instead, the client initiates (kicks off) a Bulk Data API request describing the data it is interested in. The server responds with 202 Accepted and returns a temporary URL in the Content-Location header. The client then polls this URL to find out whether the bulk data files are ready.
In your case, Fhirbase is complaining that your Bulk Data endpoint does not return that temporary status URL. Without it, Fhirbase cannot proceed to download the actual NDJSON files.
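To illustrate the flow Fhirbase expects the server to implement, here is a rough client-side sketch following the draft $export operation, using the requests library (the base URL and polling interval are placeholders):

import time
import requests

BASE = "http://localhost:6544"  # placeholder FHIR server base URL

# Kick-off request: the server should reply 202 Accepted with a Content-Location header.
kickoff = requests.get(
    f"{BASE}/Patient/$export",
    headers={"Accept": "application/fhir+json", "Prefer": "respond-async"},
)
status_url = kickoff.headers["Content-Location"]  # the header Fhirbase says is missing

# Poll the status URL until the export is finished (200 OK instead of 202 Accepted).
while True:
    status = requests.get(status_url)
    if status.status_code == 200:
        manifest = status.json()  # links to the generated NDJSON files
        break
    time.sleep(5)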

Is permanent storage of video metadata against the YouTube Data API ToS?

The YouTube Data API ToS says:
Your API Client may employ session-based caching solely of YouTube API results, but You must use commercially reasonable efforts to cause Your API Client to update cached results upon any changes in video metadata. For example, if a video is removed from the YouTube service or made "private" by the video uploader, cached results shall be removed from Your cache. For the avoidance of doubt, Your API Client shall not be designed to cache YouTube audiovisual content.
The YouTube Data API overview also says:
Your application can cache API resources and their ETags. Then, when your application requests a stored resource again, it specifies the ETag associated with that resource. If the resource has changed, the API returns the modified resource and the ETag associated with that version of the resource. If the resource has not changed, the API returns an HTTP 304 response (Not Modified), which indicates that the resource has not changed. Your application can reduce latency and bandwidth usage by serving cached resources in this manner.
This does not mean that any time I wish to request a resource I must go back to the YouTube Data API, correct?
The only data from the API I'm interested in storing in my database is:
ETag
Video id
Duration
Title
Is it okay for me to store these four items in my database (given I update the information reasonably regularly)?
I'm not interested in storing any portion of the actual video whatsoever. Just the metadata.
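For reference, this is roughly the conditional request I have in mind, based on the ETag mechanism quoted above (the video ID, API key, and stored ETag are placeholders):

import requests

API_KEY = "MY_API_KEY"           # placeholder
VIDEO_ID = "SOME_VIDEO_ID"       # placeholder
stored_etag = "etag-from-my-db"  # the ETag saved alongside the cached metadata

resp = requests.get(
    "https://www.googleapis.com/youtube/v3/videos",
    params={"part": "snippet,contentDetails", "id": VIDEO_ID, "key": API_KEY},
    # Conditional request: only return the resource if it has changed since this ETag.
    headers={"If-None-Match": stored_etag},
)

if resp.status_code == 304:
    pass  # not modified: keep serving the title and duration cached in my database
else:
    item = resp.json()["items"][0]
    # refresh the cached ETag, title (snippet.title), and duration (contentDetails.duration)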

Client-side checking before mapping to the resource / URL mapping

I have created an API that is used to upload files to AWS S3.
I want to restrict the file size to 10 MB.
The following is my API:
@POST
@Path("/upload")
@Produces("application/json")
public Response kafkaBulkProducer(InputStream a_fileInputStream) throws IOException {
    // UPLOADING LOGIC
}
As far as I understand, when a request is made to my API, the data (the InputStream) is loaded onto my server.
This consumes resources (connections, etc.).
Is there any way I can determine the file size before the URL/resource mapping is done, so that if the file is larger than 10 MB it never reaches my server?
I think I can work with a pre-matching filter, but my biggest concern and question is: when the API is called, will the stream data be stored on my server first?
Will a pre-matching filter help, so that the data is not stored on my server if its size is greater than 10 MB?
Basically, I don't want to store the data on my server, check the size, and then upload to S3.
I want a solution where I can check the file size before it is loaded onto my server and only then upload it to S3.
I will be calling this API with curl.
How can I do this?
I hope I am clear about my question.
Thank you
