Laravel AWS S3 Storage image cache

I have a Laravel-based web and mobile application that stores images on AWS S3, and I want to add caching because even a small number of app users produces hundreds and sometimes thousands of GET requests to AWS S3.
To fetch an image, the mobile app issues a GET request that is handled by code like this:
public function showImage(....) {
...
return Storage::disk('s3')->response("images/".$image->filename);
}
The response headers I receive show Cache-Control: no-cache, so I assume the mobile app won't cache this image.
How can I add cache support for this request? Should I do it?
I know that the Laravel documentation suggests caching for file storage - should I implement it for S3? Can it help decrease the number of GET requests to AWS S3? Where can I find more information about it?

I would suggest using a temporary URL as described here: https://laravel.com/docs/7.x/filesystem#file-urls
Then use the Cache to store it until it expires:
$value = Cache::remember('my-cache-key', 3600 * $hours, function () use ($hours, $image) {
    return Storage::disk('s3')->temporaryUrl(
        "images/".$image->filename, now()->addMinutes(60 * $hours + 1)
    );
});
Whenever you update the object in S3, do this to delete the cached URL:
Cache::forget('my-cache-key');
... and you will get a new URL for the new object.
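A hedged sketch of how the asker's showImage controller could tie this together (the cache key, TTL, and model lookup are illustrative assumptions, not part of the original code):

use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Storage;

public function showImage($imageId)
{
    $image = Image::findOrFail($imageId); // hypothetical Eloquent lookup
    $hours = 1;

    // Reuse the signed URL for as long as it stays valid
    $url = Cache::remember("image-url-{$image->id}", 3600 * $hours, function () use ($hours, $image) {
        return Storage::disk('s3')->temporaryUrl(
            "images/".$image->filename,
            now()->addMinutes(60 * $hours + 1)
        );
    });

    // Redirect the app to S3 instead of proxying the image bytes through Laravel
    return redirect()->away($url);
}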

You could use a CDN service like CloudFlare and set a cache header so that CloudFlare keeps the object cached for a certain amount of time. For example, when uploading the object (here using a standalone S3 helper class):
$s3->putObject(file_get_contents($path), $bucket, $url, S3::ACL_PUBLIC_READ, array(), array('Cache-Control' => 'max-age=31536000, public'));
This way, files are fetched once by CloudFlare, stored on its servers, and served to users without hitting S3 for every single request.
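If you are on Laravel's Storage facade rather than that standalone S3 helper, a roughly equivalent sketch (hedged, since the exact option handling depends on your flysystem adapter version) is to pass the Cache-Control header as an option when writing the object:

use Illuminate\Support\Facades\Storage;

// Write the object publicly with a one-year Cache-Control header so the CDN can keep it
Storage::disk('s3')->put(
    'images/'.$image->filename,
    file_get_contents($path),
    [
        'visibility'   => 'public',
        'CacheControl' => 'max-age=31536000, public',
    ]
);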
See also:
How can I reduce my data transfer cost? Amazon S3 --> Cloudflare --> Visitor
How to set the Expires and Cache-Control headers for all objects in an AWS S3 bucket with a PHP script

Related

PHP Laravel: Upload file directly to AWS S3 bucket

Can anyone help me with how to upload a file to an AWS S3 bucket using PHP Laravel? The file should be uploaded directly to S3 using a pre-signed URL.
I will try to answer this question. So, there are two ways to do this:
You send the pre-signed URL to the frontend client and let it upload the file to S3 directly; once uploaded, it notifies your server.
You receive the file directly on the server and upload it to S3; in this case, you won't need any pre-signed URL, as you would have already configured the AWS access inside the project.
Since solution 1 is self-explanatory, I will explain solution 2.
Laravel provides the Storage facade for handling filesystem operations. It follows a multiple-driver philosophy - public, local disk, Amazon S3, FTP - plus the option of extending it with custom drivers.
Step 1: Configure your .env file with your AWS credentials. You will need the following values to start using Amazon S3 as the driver (a config sketch follows this list):
AWS Key
AWS Secret
AWS Bucket Name
AWS Bucket Region
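For reference, these values map onto the s3 disk that ships in config/filesystems.php roughly like this (the env variable names below follow Laravel's defaults):

// config/filesystems.php
's3' => [
    'driver' => 's3',
    'key'    => env('AWS_ACCESS_KEY_ID'),
    'secret' => env('AWS_SECRET_ACCESS_KEY'),
    'region' => env('AWS_DEFAULT_REGION'),
    'bucket' => env('AWS_BUCKET'),
],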
Step 2: Assuming the file has already been uploaded to your server, we will now upload it to S3 from there.
If you have set s3 as the default disk, the following snippet will do the upload for you:
Storage::put('avatars/1', $fileContents);
If you are using multiple disks, you can upload the file with:
Storage::disk('s3')->put('avatars/1', $fileContents);
We are done! Your file is now in your S3 bucket - double-check it there.
If you wish to learn more about Laravel Storage, see the Laravel filesystem documentation.
use Storage;
use Config;

$client = Storage::disk('s3')->getDriver()->getAdapter()->getClient();
$bucket = Config::get('filesystems.disks.s3.bucket');

$command = $client->getCommand('PutObject', [
    'Bucket' => $bucket,
    'Key' => '344772707_360.mp4' // object key in the S3 bucket you want to upload to
]);

$request = $client->createPresignedRequest($command, '+20 minutes');

// Get the actual pre-signed URL
return (string) $request->getUri();
We can use 'PutObject' to generate a signed URL for uploading files to S3.
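For illustration only, here is a hedged sketch of how a client could then upload to that pre-signed URL (using Guzzle; the file path is an assumption):

use GuzzleHttp\Client;

$client = new Client();

// PUT the raw file body to the pre-signed URL; no AWS credentials are needed on this side
$response = $client->put($presignedUrl, [
    'body' => fopen('/path/to/344772707_360.mp4', 'r'),
]);

// A 200 status means S3 stored the object under the key the URL was signed for
if ($response->getStatusCode() === 200) {
    // notify your backend that the upload finished
}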
Make sure this package is installed:
composer require league/flysystem-aws-s3-v3 "^1.0"
Create access credentials on AWS and set these variables in the .env file:
AWS_ACCESS_KEY_ID=ORJATNRFO7SDSMJESWMW
AWS_SECRET_ACCESS_KEY=xnzuPuatfZu09103/BXorsO4H/xxxxxxxxxx
AWS_DEFAULT_REGION=ap-south-1
AWS_BUCKET=xxxxxxx
AWS_URL=http://xxxxx.s3.ap-south-1.amazonaws.com/
public function uploadToS3(Request $request)
{
    $file = $request->file('file');

    // Store the uploaded file's contents under the given path on the s3 disk
    \Storage::disk('s3')->put(
        'path/in/s3/filename.jpg',
        file_get_contents($file->getRealPath())
    );
}
You can create the access credentials in the AWS IAM console (Security Credentials).

Destroy image from Cloudinary

When I delete an image from my GraphQL server using cloudinary.uploader.destroy(public_id), it is removed from the Cloudinary media library (https://cloudinary.com/console/media_library/folders/%2F),
but the image is still available if I access it via the Cloudinary delivery URL (https://res.cloudinary.com/db9rcrnuw/image/upload/v1576054005/47122.png).
I want those URLs to stop working as well when the image is deleted.
Here, screen.basePath is the public_id of the image:
const screen = await ctx.prisma
    .deleteScreen({
        id: args.screenId
    })
    .$fragment(fragment);

if (screen.basePath.length === 5) {
    console.log(screen.basePath.length);
    cloudinary.uploader.destroy(screen.basePath, function(error, result) {
        console.log(result, error);
    });
    return screen;
}
The short answer is that this is caused by a combination of (a) not using the 'invalidate' parameter in your destroy API call and (b) a mismatch between the URL format the resource is accessed with (a full version number such as v123456789, 'v1', or no version number) and the format your account is configured to send for invalidation.
The first thing to do is ensure that all destroy API calls include the 'invalidate' parameter set to 'true' if you'd like CDN invalidation (in the Node SDK this is passed via the options argument, e.g. { invalidate: true }).
Regarding the URL formats;
The 'v1576054005' that is part of delivery URLs is a version number that is essentially the UNIX timestamp of the upload time of the asset. Its main purpose is to always return the latest image and avoid CDN caching (upload API responses return the URL with the latest upload version). A bit more information on this topic can be found in this article - https://support.cloudinary.com/hc/en-us/articles/202520912-What-are-image-versions.
Please note that there are three possible URL formats Cloudinary can send for invalidation at the CDN, and these are outlined here: https://support.cloudinary.com/hc/en-us/articles/360001208732-What-URL-conventions-are-invalidated
Invalidation requests are sent when you delete or overwrite an image via the Media Library UI, or when you use the SDK/API and also pass the 'invalidate' parameter set to 'true'.
By default, all accounts send invalidations for the default URL format the SDKs produce, which uses no version number for assets in the root of your account and a 'v1' placeholder for assets in folders (option 1 from the URL above).
If you were accessing the image with the full version component, then that isn't sent for invalidation by default, which is likely why you are getting a cached copy returned.
In your case, the URL that would've been sent for invalidation would be without a version component (as the resource is in the root folder) i.e.
https://res.cloudinary.com/db9rcrnuw/image/upload/47122.png
How you build your URLs - e.g. whether you use the SDK helper methods or take the URL from the url or secure_url fields of the Upload API response (which include the full version number) - determines the format, and thus how your account should be configured for invalidation.
I suggest you email Cloudinary support (support@cloudinary.com) and share a link to this thread, as well as some details on how the URLs you're using are generated, so that your account can be configured accordingly.

Serve static files in Flask from private AWS S3 bucket

I am developing a Flask app running on Heroku that allows users to upload images. The app has a page displaying the user's images in a table.
For development purposes, I was saving the uploaded files to Heroku's ephemeral filesystem, and everything works fine: the images are correctly loaded and displayed (I am using the last method shown here, which relies on send_from_directory()). Now I have moved the storage to S3 and I am trying to adapt the code. I use boto3 to upload the files to the bucket, and that works fine. My doubts are about the download needed to populate the users' pages with their images.
As explained here, I could set the files as "public-read" and use their URLs (I think this is what Flask-S3 does), but I'd rather not leave the files publicly accessible. So my attempted solution is to download each file to Heroku's filesystem and serve the image again with send_from_directory() as follows:
app.py
@app.route('/download/<resource>')
def download_image(resource):
    """resource: name of the file to download"""
    s3 = boto3.client('s3',
                      aws_access_key_id=current_app.config['S3_ACCESS_KEY'],
                      aws_secret_access_key=current_app.config['S3_SECRET_KEY'])
    s3.download_file(current_app.config['S3_BUCKET_NAME'],
                     resource,
                     os.path.join('tmp', resource))
    return send_from_directory('tmp',  # Heroku's ephemeral filesystem
                               resource,
                               as_attachment=False)
Then, in the template I generate the URL for the image as follows:
...
<img src="{{ url_for('app.download_image', resource=resource) }}" height="120" width="120">
...
It works, but I don't think this is the proper way, for several reasons: among them, I would have to manage Heroku's filesystem to avoid using up all the space between dyno restarts (i.e. delete the images from the filesystem).
What is the best/preferred way, also considering performance?
Thanks a lot
The preferred way is to simply create a pre-signed URL for the image and return a redirect to that URL. This keeps the files private in S3, but generates a temporary, time-limited URL that can be used to download the file directly from S3. That greatly reduces the amount of work happening on your server, as well as the amount of data transfer consumed by your server. Something like this:
@app.route('/download/<resource>')
def download_image(resource):
    """resource: name of the file to download"""
    s3 = boto3.client('s3',
                      aws_access_key_id=current_app.config['S3_ACCESS_KEY'],
                      aws_secret_access_key=current_app.config['S3_SECRET_KEY'])
    url = s3.generate_presigned_url('get_object',
                                    Params={'Bucket': current_app.config['S3_BUCKET_NAME'],
                                            'Key': resource},
                                    ExpiresIn=100)
    return redirect(url, code=302)
If you don't like that solution, you should at least look into streaming the file contents from S3 instead of writing it to the file system.

How to control access to files at another server in Laravel

I have one host for my Laravel website and another (non-Laravel) host for stored files. Direct access to the files is blocked completely by default, and I want to control access to them by creating temporary links in my Laravel site. I know how to code; I just want to know the idea of how to do it (not the details).
From the Laravel docs:
Temporary URLs: For files stored using the s3 or rackspace driver, you may create a temporary URL to a given file using the temporaryUrl method. This method accepts a path and a DateTime instance specifying when the URL should expire:
$url = Storage::temporaryUrl(
    'file.jpg', now()->addMinutes(5)
);
You could also build your own solution by directing all image requests through your own server and making sure the file visibility is set to private (a sketch using signed routes follows the example below).
Here is an example of how a controller could return an image from your storage:
public function get($path)
{
    $file = Storage::disk('s3')->get($path);

    // Do your temp link solution here

    return response($file, 200)->header('Content-Type', 'image/png');
}
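Here is a minimal sketch of that idea using Laravel's signed routes (the route name, the 'remote' disk, and the paths are assumptions): generate an expiring link, then have the route verify the signature before fetching the file from the other host.

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;
use Illuminate\Support\Facades\Storage;
use Illuminate\Support\Facades\URL;

// Generate a link that is only valid for 5 minutes
$url = URL::temporarySignedRoute('files.show', now()->addMinutes(5), ['path' => 'images/photo.png']);

// Route that serves the file only while the signature is valid
Route::get('/files/show', function (Request $request) {
    abort_unless($request->hasValidSignature(), 403); // expired or tampered links are rejected

    // 'remote' is a hypothetical disk (e.g. FTP/SFTP) pointing at the non-Laravel file host
    $file = Storage::disk('remote')->get($request->query('path'));

    return response($file, 200)->header('Content-Type', 'image/png');
})->name('files.show');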
What I am using right now is the Flysystem integration provided by Laravel. Laravel's Flysystem integration offers simple drivers for working with local filesystems, Amazon S3, and several other storage providers, so it doesn't matter whether the file server is a Laravel server or not.
Even better, it's very simple to switch between servers by just changing the disk configuration.
As far as I know, you can also create temporary URLs for the s3 and rackspace drivers by calling the temporaryUrl method. Caching is built in as well.
Here's the thing: if your files are stored on AWS S3, then:
use Storage;
use Carbon\Carbon;

$file_path = "4/1563454594.mp4";

if (Storage::disk('s3')->exists($file_path)) {
    // Link expiration time
    $urlExpires = Carbon::now()->addMinutes(1);

    try {
        $tempUrl = Storage::disk('s3')->temporaryUrl($file_path, $urlExpires);
    } catch (\Exception $e) {
        // Drivers that don't support temporary URLs throw an exception here
        return response($e->getMessage());
    }
}
Your temporary URL will be generated; after the given expiration time (1 minute), it will expire.

How to tell CloudFront not to cache 302 responses from S3 redirects, or how else to work around this image cache generation issue

I'm using Imagine via the LIIPImagineBundle for Symfony2 to create cached versions of images stored in S3.
Cached images are stored in an S3 web-enabled bucket served by CloudFront. However, the default LIIPImagineBundle implementation of S3 is far too slow for me (checking whether the file exists on S3, then creating a URL either to the cached file or to the resolve functionality), so I've worked out my own workflow:
Pass the client the CloudFront URL where the cached image should exist.
The client requests the image via the CloudFront URL; if it does not exist, the S3 bucket has a redirect rule that 302-redirects the user to an Imagine webserver path, which generates the cached version of the file and saves it to the appropriate location on S3.
The webserver 301-redirects the user back to the CloudFront URL, where the image is now stored, and the client is served the image.
This is working fine as long as I don't use CloudFront. The problem appears to be that CloudFront is caching the 302 redirect response (even though the HTTP spec states that it shouldn't). Thus, if I use CloudFront, the client is sent into an endless redirect loop back and forth between the webserver and CloudFront, and every subsequent request for the file still redirects to the webserver even after the file has been generated.
If I use S3 directly instead of CloudFront, there are no issues and this solution is solid.
According to Amazon's documentation, S3 redirect rules don't allow me to specify custom headers (to set Cache-Control headers or the like), and I don't believe CloudFront lets me control the caching of redirects (if it does, it's well hidden). CloudFront's invalidation options are so limited that I don't think they will work (can only invalidate 3 objects at any time). I could pass an argument back to CloudFront on the first redirect (from the Imagine webserver) to fix the endless redirect (e.g. image.jpg?1), but subsequent requests for the same object will still 302 to the webserver and then 301 back to CloudFront even though it exists. I feel like there should be an elegant solution to this problem, but it's eluding me. Any help would be appreciated!
I'm solving this same issue by setting the "Default TTL" in CloudFront "Cache Behavior" settings to 0, but still allowing my resized images to be cached by setting the CacheControl MetaData on the S3 file with max-age=12313213.
This way, redirects are not cached (they fall under the default TTL), but my resized images are (thanks to the CacheControl max-age set on the S3 object).
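For illustration, setting that metadata with the AWS SDK for PHP when the resized image is written might look roughly like this (the bucket, key, and region are assumptions):

use Aws\S3\S3Client;

$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'eu-west-1', // assumed region
]);

$s3->putObject([
    'Bucket'       => 'my-resized-images',           // assumed bucket
    'Key'          => 'cache/media/thumb/image.jpg', // assumed key
    'SourceFile'   => '/tmp/image.jpg',
    'CacheControl' => 'max-age=12313213', // cached on a hit, while redirects fall under the 0s default TTL
]);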
If you really need to use CloudFront here, the only thing I can think of is to not directly subject the user to the 302/301 dance. Could you introduce some sort of proxy script/page to front S3 and that whole process (or does that then defeat the point)? A rough sketch follows the list below.
So a cache miss would look like this:
1. Visitor requests the proxy page through CloudFront.
2. Proxy page requests the image from S3.
3. Proxy page receives a 302 from S3 and follows it to the Imagine web server.
4. Ideally the proxy just returns the image from here (while letting the Imagine server update S3), or it follows the 301 back to S3.
5. Proxy page returns the image to the visitor.
6. The image is cached by CloudFront.
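A very rough sketch of such a proxy, hedged since the S3 website endpoint and the controller wiring are assumptions (Symfony-flavoured because the question uses LiipImagineBundle):

use GuzzleHttp\Client;
use Symfony\Component\HttpFoundation\Response;

public function proxyImageAction($path)
{
    // Follow the 302 to the Imagine server (and its 301 back) server-side,
    // so the visitor and CloudFront only ever see the final image bytes
    $client = new Client(['allow_redirects' => true]);
    $upstream = $client->get('http://my-bucket.s3-website-eu-west-1.amazonaws.com/'.$path); // assumed S3 website endpoint

    return new Response(
        (string) $upstream->getBody(),
        200,
        [
            'Content-Type'  => $upstream->getHeaderLine('Content-Type'),
            'Cache-Control' => 'max-age=31536000, public', // let CloudFront cache the proxied image
        ]
    );
}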
TL;DR: Make use of Lambda@Edge
We face the same problem using LiipImagineBundle.
For development, an NGINX server serves the content from the local filesystem and resolves images that are not yet stored using a simple proxy_pass:
location ~ ^/files/cache/media/ {
    try_files $uri @public_cache_fallback;
}

location @public_cache_fallback {
    rewrite ^/files/cache/media/(.*)$ media/image-filter/$1 break;
    proxy_set_header X-Original-Host $http_host;
    proxy_set_header X-Original-Scheme $scheme;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_pass http://0.0.0.0:80/$uri;
}
As soon as you want to integrate CloudFront, things get more complicated due to caching. While you can easily add S3 (static website, see below) as a distribution origin, CloudFront itself will not follow the resulting redirects but will return them to the client. In the default configuration, CloudFront will then cache this redirect and NOT the desired image (see https://stackoverflow.com/a/41293603/6669161 for a workaround with S3).
The best way would be to use a proxy as described here. However, this adds another layer, which might be undesirable. Another solution is to use Lambda@Edge functions (see here). In our case, we use S3 as a normal distribution origin and make use of the "Origin Response" event (you can edit this in the "Behaviors" tab of your distribution). Our Lambda function just checks whether the request to S3 was successful. If it was, we can simply forward the response. If it was not, we assume that the desired object has not yet been created. The Lambda function then calls our application, which generates the object and stores it in S3. For simplicity, the application replies with a redirect (to CloudFront again) too, so we can just forward that to the client. A drawback is that the client will see one redirect. Also make sure to set the cache headers so that CloudFront does not cache the Lambda redirect.
Here is an example Lambda function. This one just redirects the client to the resolve URL (which then redirects to CloudFront again). Keep in mind that this results in more round trips for the client (which is not perfect); however, it reduces the execution time of your Lambda function. Make sure to add the base Lambda@Edge policy (related tutorial).
env = {
    'Protocol': 'http',
    'HostName': 'localhost:8000',
    'HttpErrorCodeReturnedEquals': '404',
    'HttpRedirectCode': '307',
    'KeyPrefixEquals': '/cache/media/',
    'ReplaceKeyPrefixWith': '/media/resolve-image-filter/'
}

def lambda_handler(event, context):
    response = event['Records'][0]['cf']['response']

    if int(response['status']) == int(env['HttpErrorCodeReturnedEquals']):
        request = event['Records'][0]['cf']['request']
        original_path = request['uri']

        if original_path.startswith(env['KeyPrefixEquals']):
            new_path = env['ReplaceKeyPrefixWith'] + original_path[len(env['KeyPrefixEquals']):]
        else:
            new_path = original_path

        location = '{}://{}{}'.format(env['Protocol'], env['HostName'], new_path)

        response['status'] = env['HttpRedirectCode']
        response['statusDescription'] = 'Resolve Image'
        response['headers']['location'] = [{
            'key': 'Location',
            'value': location
        }]
        response['headers']['cache-control'] = [{
            'key': 'Cache-Control',
            'value': 'no-cache'  # also make sure that your minimum TTL is set to 0 (for the distribution)
        }]

    return response
If you just want to use S3 as a cache (without CloudFront), static website hosting plus a redirect rule will redirect clients to the resolve URL when a cache file is missing (you will need to rewrite the S3 cache resolver URLs to the website version, though):
<RoutingRules>
  <RoutingRule>
    <Condition>
      <HttpErrorCodeReturnedEquals>403</HttpErrorCodeReturnedEquals>
      <KeyPrefixEquals>cache/media/</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <Protocol>http</Protocol>
      <HostName>localhost</HostName>
      <ReplaceKeyPrefixWith>media/image-filter/</ReplaceKeyPrefixWith>
      <HttpRedirectCode>307</HttpRedirectCode>
    </Redirect>
  </RoutingRule>
</RoutingRules>
