Allow only CloudFront to read from origin servers? - amazon-ec2

I'm using origin servers on CloudFront (as opposed to s3) with signed URLs. I need a way to ensure that requests to my server are coming only from CloudFront. That is, a way to prevent somebody from bypassing CloudFront and requesting a resource directly on my server. How can this be done?

As per the documentation, there's no support for that yet. The only thing I can think of is you can restrict more access, although not entirely by just allowing only Amazon IP addresses to your webserver. They should be able to provide them to you (IP address ranges) as they have provided them to us.
This what the docs say:
Using an HTTP Server for Private Content
You can use signed URLs for any CloudFront distribution, regardless of whether the origin is an Amazon S3 bucket or an HTTP server. However, for CloudFront to access your objects on an HTTP server, the objects must remain publicly accessible. Because the objects are publicly accessible, anyone who has the URL for an object on your HTTP server can access the object without the protection provided by CloudFront signed URLs. If you use signed URLs and your origin is an HTTP server, do not give the URLs for the objects on your HTTP server to your customers or to others outside your organization.

I've just done this for myself, and thought I'd leave the answer here where I started my search.
Here's the few lines you need to put in your .htaccess (assuming you've already turned the rewrite engine on):
RewriteCond %{HTTP_HOST} ^www-origin\.example\.com [NC]
RewriteCond %{HTTP_USER_AGENT} !^Amazon\ CloudFront$ [NC]
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]
This will redirect all visitors to your Cloudfront distribution - https://example.com in this, um, example - and only let www-origin.example.com work for Amazon CloudFront. If your website code is also on a different URL (a development or staging server, for example) this won't get in the way.
Caution: the user-agent is guessable and spoofable; a more secure way of achieving this would be to set a custom HTTP header in Cloudfront, and check for its value in .htaccess.

I ended up creating 3 Security Groups filled solely with CloudFront IP addresses.
I found the list of IPs on this AWS docs page.
If you want to just copy and paste the IP ranges into the console, you can use this list I created:
Regional:
13.113.196.64/26, 13.113.203.0/24, 52.199.127.192/26, 13.124.199.0/24, 3.35.130.128/25, 52.78.247.128/26, 13.233.177.192/26, 15.207.13.128/25, 15.207.213.128/25, 52.66.194.128/26, 13.228.69.0/24, 52.220.191.0/26, 13.210.67.128/26, 13.54.63.128/26, 99.79.169.0/24, 18.192.142.0/23, 35.158.136.0/24, 52.57.254.0/24, 13.48.32.0/24, 18.200.212.0/23, 52.212.248.0/26, 3.10.17.128/25, 3.11.53.0/24, 52.56.127.0/25, 15.188.184.0/24, 52.47.139.0/24, 18.229.220.192/26, 54.233.255.128/26, 3.231.2.0/25, 3.234.232.224/27, 3.236.169.192/26, 3.236.48.0/23, 34.195.252.0/24, 34.226.14.0/24, 13.59.250.0/26, 18.216.170.128/25, 3.128.93.0/24, 3.134.215.0/24, 52.15.127.128/26, 3.101.158.0/23, 52.52.191.128/26, 34.216.51.0/25, 34.223.12.224/27, 34.223.80.192/26, 35.162.63.192/26, 35.167.191.128/26, 44.227.178.0/24, 44.234.108.128/25, 44.234.90.252/30
Global:
120.52.22.96/27, 205.251.249.0/24, 180.163.57.128/26, 204.246.168.0/22, 205.251.252.0/23, 54.192.0.0/16, 204.246.173.0/24, 54.230.200.0/21, 120.253.240.192/26, 116.129.226.128/26, 130.176.0.0/17, 99.86.0.0/16, 205.251.200.0/21, 223.71.71.128/25, 13.32.0.0/15, 120.253.245.128/26, 13.224.0.0/14, 70.132.0.0/18, 13.249.0.0/16, 205.251.208.0/20, 65.9.128.0/18, 130.176.128.0/18, 58.254.138.0/25, 54.230.208.0/20, 116.129.226.0/25, 52.222.128.0/17, 64.252.128.0/18, 205.251.254.0/24, 54.230.224.0/19, 71.152.0.0/17, 216.137.32.0/19, 204.246.172.0/24, 120.52.39.128/27, 118.193.97.64/26, 223.71.71.96/27, 54.240.128.0/18, 205.251.250.0/23, 180.163.57.0/25, 52.46.0.0/18, 223.71.11.0/27, 52.82.128.0/19, 54.230.0.0/17, 54.230.128.0/18, 54.239.128.0/18, 130.176.224.0/20, 36.103.232.128/26, 52.84.0.0/15, 143.204.0.0/16, 144.220.0.0/16, 120.52.153.192/26, 119.147.182.0/25, 120.232.236.0/25, 54.182.0.0/16, 58.254.138.128/26, 120.253.245.192/27, 54.239.192.0/19, 18.64.0.0/14, 120.52.12.64/26, 99.84.0.0/16, 130.176.192.0/19, 52.124.128.0/17, 204.246.164.0/22, 13.35.0.0/16, 204.246.174.0/23, 36.103.232.0/25, 119.147.182.128/26, 118.193.97.128/25, 120.232.236.128/26, 204.246.176.0/20, 65.8.0.0/16, 65.9.0.0/17, 120.253.241.160/27, 64.252.64.0/18
I'd like to note that by default, Security Groups only allow a maximum of 60 inbound and outbound rules each, which is why I'm splitting these up 122 IPs into 3 security groups.
After creating your 3 Security Groups, attach them to your EC2 (you can attach multiple Security Groups to an EC2). I left the EC2's default Security Group to only allow SSH traffic from my IP address.
Then you should be good to go! This forces users to use your CloudFront distribution and keeps your EC2's IP/DNS private.

AWS have finally created an AWS managed prefix list for CloudFront to Origin server requests. So no more need for custom Lambdas updating Security Groups etc.
Use the prefix com.amazonaws.global.cloudfront.origin-facing in your Security Groups etc.
See the following links for more info:
The What's New Announcement
The Documentation

Related

Laravel forcing Http for asssets

this is a little bit strange because most of the questions here wanted to force https.
While learning AWS elastic beanstalk. I am hosting a laravel site there. Everything is fine, except that none of my javascripts and css files are being loaded.
If have referenced them in the blade view as :
<script src="{{asset('assets/backend/plugins/jquery/jquery.min.js')}}"></script>
First thing I tried was looking into the file/folder permissions in the root of my project by SSHing into EC2 instance. Didn't work even when I set the permission to public folder to 777.
Later I found out that, the site's main page url was http while all the assets url were 'https'.
I dont want to get into the SSL certificates things just yet, if it is possible.
Is there anyway I can have my assets url be forced to Http only?
Please forgive my naiveity. Any help would be appreciated.
This usually happens if your site is for example behind an reverse proxy, As the URL helper facade, trusts on your local instance that is beyond the proxy, and might not use SSL. Which can be misleading/wrong.
Which is probaly the case on a EC2 instance... as the SSL termination is beyond load balancers/HA Proxies.
i usually add the following to my AppServiceProvider.php
public function boot()
{
if (Str::startsWith(config('app.url'), 'https')) {
\URL::forceScheme('https');
} else {
\URL::forceScheme('http');
}
}
Of course this needs to ensure you've set app.url / APP_URL, if you are not using that, you can just get rid of the if statement. But is a little less elegant, and disallows you to develop on non https

Caching all images on external site through Cloudflare

Here is my situation:
I have a webapp that uses a lot of images on a remote server. My webapp is behind Cloudflare, although the server that the images are hosted on are not.. and this server can be very slow. It can sometimes take about 5 seconds per image.
I would like to use Cloudflare to proxy requests to this external server, but also cache them indefinitely, or at least as long as possible. The images never change so I do not mind them having a long cache life.
Is this something I should set up in a worker? As a page rule? Or just not use CLoudflare in this way?
If you can't change origin server headers, you could try following snippet in your worker:
fetch(event.request, { cf: { cacheTtl: 300 } })
As per docs:
This option forces Cloudflare to cache the response for this request,
regardless of what headers are seen on the response. This is
equivalent to setting two page rules: “Edge Cache TTL” and “Cache
Level” (to “Cache Everything”).
I think you generally just want a very long caching header on your images. Something like:
Cache-Control: public; max-age=31536000

Swagger page being redirected from https to http

AWS Elastic Load Balancer listening through HTTPS (443) using SSL and redirecting requests to EC2 instances through HTTP (80), with IIS hosting a .net webapi application, using swashbuckle to describe the API methods.
Home page of the API (https://example.com) has a link to Swagger documentation which can bee read as https://example.com/swagger/ui/index.html when you hove over on the link.
If I click on the link it redirects the request on the browser to http://example.com/swagger/ui/index.html which displays a Page Not Found error
but if I type directly in the browser URL https://example.com/swagger/ui/index.html then it loads Swagger page, but then, when expanding the methods an click on "Try it out", the Request URL starts with "http" again.
This configuration is only for Stage and Production environments. Lower environments don't use the load balancer and just use http.
Any ideas on how to stop https being redirected to http? And how make swagger to display Request URLs using https?
Thank you
EDIT:
I'm using a custom index.html file
Seems is a known issue for Swashbuckle. Quote:
"By default, the service root url is inferred from the request used to access the docs. However, there may be situations (e.g. proxy and load-balanced environments) where this does not resolve correctly. You can workaround this by providing your own code to determine the root URL."
What I did was provide the root url and/or scheme to use based on the environment
GlobalConfiguration.Configuration
.EnableSwagger(c =>
{
...
c.RootUrl(req => GetRootUrlFromAppConfig(req));
...
c.Schemes(GetEnvironmentScheme());
...
})
.EnableSwaggerUi(c =>
{
...
});
where
public static string[] GetEnvironmentScheme()
{
...
}
public static string GetRootUrlFromAppConfig(HttpRequestMessage request)
{
...
}
The way I would probably do it is having a main file, and generating during the build of your application a different swagger file based on the environnement parameters for schemes and hosts.
That way, you have to manage only one swagger file accross your environments, and you only have to manage a few extra environnement properties, host and schemes (if you don't already have them)
Since I don't know about swashbuckle, I cannot answer for sure at your first question (the redirect)

How to enable CORS on Sonatype Nexus?

I want to develop a Monitoring-WebApp for different things with AngularJS as Frontend. One of the core-elements is showing an overview of Nexus-Artifacts/Repositories.
When I request the REST-API I'm getting following error back:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'http://localhost:9090' is therefore not allowed access.
To fix this error, I need to modify the response headers to enable CORS.
It would be great if anyone is familiar with that type of problem and could give me an answer!
The CORS headers are present in the response of the system you are trying to invoke. (Those are checked on the client side [aka browser this case], you can implement a call on your backend to have those calls and there you can ignore those headers, but that could become quite hard to maintain.) To change those you'll need a proxy. So your application will not call the url directly like
fetch("http://localhost:9090/api/sometest")
There are at least two ways: one to add a proxy directly before the sonar server and modify the headers for everyone. I do not really recommend this because of security reasons. :)
The other more maintaneable solution is to go through the local domain of the monitoring web app as follows:
fetch("/proxy/nexus/api/sometest")
To achieve this you need to setup a proxy where your application is running. This could map the different services which you depend on, and modify the headers if necessary.
I do not know which application http server are you going to use, but here are some proxy configuration documentations on the topic:
For Apache HTTPD mod_proxy you could use a configuration similar to this:
ProxyPass "/proxy/nexus/" "http://localhost:9090/"
ProxyPassReverse "/proxy/nexus/" "http://localhost:9090/"
It is maybe necessary to use the cookies as well so you may need to take a look at the following configurations:
ProxyPassReverseCookiePath
ProxyPassReverseCookieDomain
For Nginx location you could employ something as follows
location /proxy/nexus/ {
proxy_pass http://localhost:9090/;
}
For node.js see documentation: https://github.com/nodejitsu/node-http-proxy
module.exports = (req, res, next) => {
proxy.web(req, res, {
target: 'http://localhost:4003/',
buffer: streamify(req.rawBody)
}, next);
};

How to tell cloudfront to not cache 302 responses from S3 redirects, or, how else to workaround this image caching generation issue

I'm using Imagine via the LIIPImagineBundle for Symfony2 to create cached versions of images stored in S3.
Cached images are stored in an S3 web enabled bucket served by CloudFront. However, the default LIIPImagineBundle implementation of S3 is far too slow for me (checking if the file exists on S3 then creating a URL either to the cached file or to the resolve functionality), so I've worked out my own workflow:
Pass client the cloudfront URL where the cached image should exist
Client requests the image via the cloudfront URL, if it does not exist then the S3 bucket has a redirect rule which 302 redirects the user to an imagine webserver path which generates the cached version of the file and saves it to the appropriate location on S3
The webserve 301 redirects the user back to the cloudfront URL where the image is now stored and the client is served the image.
This is working fine as long as I don't use cloudfront. The problem appears to be that cloudfront is caching the 302 redirect response (even though the http spec states that they shouldn't). Thus, if I use cloudfront, the client is sent in an endless redirect loop back and forth from webserver to cloudfront, and every subsequent request to the file still redirects to the webserver even after the file has been generated.
If I use S3 directly instead of cloudfront there are no issues and this solution is solid.
According to Amazon's documentation S3 redirect rules don't allow me to specify custom headers (to set cache-control headers or the like), and I don't believe that CloudFront allows me to control the caching of redirects (if they do it's well hidden). CloudFront's invalidation options are so limited that I don't think they will work (can only invalidate 3 objects at any time)...I could pass an argument back to cloudfront on the first redirect (from the Imagine webserver) to fix the endless redirect (eg image.jpg?1), but subsequent requests to the same object will still 302 to the webserver then 301 back to cloudfront even though it exists. I feel like there should be an elegant solution to this problem but it's eluding me. Any help would be appreciated!!
I'm solving this same issue by setting the "Default TTL" in CloudFront "Cache Behavior" settings to 0, but still allowing my resized images to be cached by setting the CacheControl MetaData on the S3 file with max-age=12313213.
This way redirects will not be cached (default TTL behavior) but my resized images will be (CacheControl max-age on s3 cache hit).
If you really need to use CloudFront here, the only thing I can think of is that you don’t directly subject the user to the 302, 301 dance. Could you introduce some sort of proxy script / page to front S3 and that whole process? (or does that then defeat the point).
So a cache miss would look like this:
Visitor requests proxy page through Cloudfront.
Proxy page requests image from S3
Proxy page receives 302 from S3, follows this to Imagine web
server
Ideally just return the image from here (while letting it update
S3), or follow 301 back to S3
Proxy page returns image to visitor
Image is cached by Cloudfront
TL;DR: Make use of Lambda#Edge
We face the same problem using LiipImagineBundle.
For development, an NGINX serves the content from the local filesystem and resolves images that are not yet stored using a simple proxy_pass:
location ~ ^/files/cache/media/ {
try_files $uri #public_cache_fallback;
}
location #public_cache_fallback {
rewrite ^/files/cache/media/(.*)$ media/image-filter/$1 break;
proxy_set_header X-Original-Host $http_host;
proxy_set_header X-Original-Scheme $scheme;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_pass http://0.0.0.0:80/$uri;
}
As soon as you want to integrate CloudFront things get more complicated due to caching. While you can easily add S3 (static website, see below) as a distribution, CloudFront itself will not follow the resulting redirects but return them to the client. In the default configuration CloudFront will then cache this redirect and NOT the desired image (see https://stackoverflow.com/a/41293603/6669161 for a workaround with S3).
The best way would be to use a proxy as described here. However, this adds another layer which might be undesirable. Another solution is to use Lambda#Edge functions as (see here). In our case, we use S3 as a normal distribution and make use of the "Origin Response"-Event (you can edit them in the "Behaviors" tab of your distribution). Our Lambda function just checks if the request to S3 was successful. If it was, we can just forward it. If it was not, we assume that the desired object was not yet created. The lambda function then calls our application that generates the object and stores it in S3. For simplicity, the application replies with a redirect (to CloudFront again), too - so we can just forward that to the client. A drawback is that the client itself will see one redirect. Also make sure to set the cache headers so that CloudFront does not cache the lambda redirect.
Here is an example Lambda Function. This one just redirects the client to the resolve url (which then redirects to CloudFront again). Keep in mind that this will result in more round trips for the client (which is not perfect). However, it will reduce the execution time of your Lambda function. Make sure to add the Base Lambda#Edge policy (related tutorial).
env = {
'Protocol': 'http',
'HostName': 'localhost:8000',
'HttpErrorCodeReturnedEquals': '404',
'HttpRedirectCode': '307',
'KeyPrefixEquals': '/cache/media/',
'ReplaceKeyPrefixWith': '/media/resolve-image-filter/'
}
def lambda_handler(event, context):
response = event['Records'][0]['cf']['response']
if int(response['status']) == int(env['HttpErrorCodeReturnedEquals']):
request = event['Records'][0]['cf']['request']
original_path = request['uri']
if original_path.startswith(env['KeyPrefixEquals']):
new_path = env['ReplaceKeyPrefixWith'] + original_path[len(env['KeyPrefixEquals']):]
else:
new_path = original_path
location = '{}://{}{}'.format(env['Protocol'], env['HostName'], new_path)
response['status'] = env['HttpRedirectCode']
response['statusDescription'] = 'Resolve Image'
response['headers']['location'] = [{
'key': 'Location',
'value': location
}]
response['headers']['cache-control'] = [{
'key': 'Cache-Control',
'value': 'no-cache' # Also make sure that you minimum TTL is set to 0 (for the distribution)
}]
return response
If you just want to use S3 as a cache (without CloudFront). Using static website hosting and a redirect rule will redirect clients to the resolve url in case of missing cache files (you will need to rewrite S3 Cache Resolver urls to the website version though):
<RoutingRules>
<RoutingRule>
<Condition><HttpErrorCodeReturnedEquals>403</HttpErrorCodeReturnedEquals>
<KeyPrefixEquals>cache/media/</KeyPrefixEquals>
</Condition>
<Redirect>
<Protocol>http</Protocol>
<HostName>localhost</HostName>
<ReplaceKeyPrefixWith>media/image-filter/</ReplaceKeyPrefixWith>
<HttpRedirectCode>307</HttpRedirectCode>
</Redirect>
</RoutingRule>
</RoutingRules>

Resources