Amazon S3 404 pages are cached on Cloudflare CDN - caching

I am using a Cloudflare DNS subdomain that points to an Amazon S3 bucket. The problem is that Cloudflare caches the 404 response from Amazon S3. Even after I upload the image to Amazon S3, Cloudflare keeps responding with 404 because the previous response is cached. I want to use the Cloudflare cache for performance reasons, but I don't want to manually clear the cache for every URL that returned a 404.
If Amazon S3 responds with a 404, there is obviously no point in caching that URL.
Maybe I am missing some Cloudflare setting that handles this.

CloudFlare actually caches 404s for about ten minutes (lightens the potential load on your server). Have you looked at purging your cache as one workaround?

I have this problem not only with S3 but also with responses from my own web server.
The solution they give seems to be to add a no-cache header to the server's 404 responses.
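For an origin you control (this does not apply to S3 itself, where you cannot set headers on its 404 responses), the no-cache approach could look roughly like the sketch below. It is a minimal WSGI wrapper, not Cloudflare-specific code; `demo_app` and `call` are hypothetical helpers added only for illustration, and whether a given CDN honors the header on error responses is something to verify against its documentation.

```python
def no_cache_404(app):
    """Wrap a WSGI app so that its 404 responses carry a no-cache header."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if status.startswith("404"):
                # Replace any existing Cache-Control so the miss is not cached.
                headers = [(k, v) for k, v in headers
                           if k.lower() != "cache-control"]
                headers.append(("Cache-Control", "no-cache, no-store, max-age=0"))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    """Hypothetical app: serves /ok and answers 404 for everything else."""
    if environ.get("PATH_INFO") == "/ok":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(app, path):
    """Invoke a WSGI app once and return (status, headers as a dict)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    list(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], captured["headers"]
```

Wrapping the application once (`app = no_cache_404(demo_app)`) leaves successful responses untouched, so they can still be cached normally.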

Related

Request to Spring Boot application via Cloudfront fails inexplicably with 403 status

When I navigate to web.mysite.com, a static SPA hosted in S3, it has an iframe which has a src of mysite.com/some/path, which is a Spring Boot MVC application in Elastic Beanstalk. Both are behind Cloudfront distributions for HTTPS. This path is handled in the application with a custom resource resolver. This loads successfully, but inside the iframe content there is a script tag looking for mysite.com/some/path/thatsdifferent, handled by the same resolver.
This second request fails with a 403 and I cannot determine why. Navigating to the failing mysite.com/some/path/thatsdifferent directly in my browser or using postman succeeds with a 200 status. The server is configured to allow requests from web.mysite.com through CORS configuration (and there is no CORS-related error message) and Spring Security is configured to permitAll any requests to /some/** regardless of authentication. There is no response body or error message beyond the header x-cache: Error from cloudfront.
If I navigate to the-beanstalk-env-url.com/some/path, it loads the html and then successfully loads the content from the-beanstalk-env-url.com/some/path/thatsdifferent.
Requests to a few different but similar paths succeed. Going to a path which definitely does not exist returns a 404.
The server logs show that the request is being successfully handled and Cloudfront is returning reasonable responses to the client. Looking at the Cloudfront logs simply reports a 403, without any additional information.
Almost 100% of Cloudfront 403 error articles and questions involve S3, which is not the part which is failing here.
Changing the Cloudfront distribution's Allowed Methods from GET, HEAD to GET, HEAD, OPTIONS caused direct requests to mysite.com/some/path/thatsdifferent to begin failing with "invalid CORS request". This was fixed by whitelisting the Accept, Authorization, Host, Origin and Referer headers, but that did not fix the underlying error.
Adjusting the logging level for org.springframework.security doesn't produce any extra information when a failing request occurs, so my application's security configuration does not appear to be what is causing the error.
After replacing Cloudfront with a load balancer on my environment in Route 53, the scenario works as expected, so the problem is definitely in Cloudfront.
The solution was to switch the Cloudfront Origin Protocol policy from HTTP Only to HTTPS Only.
I don't know why this mattered for the script file and not the html file, but I decided to test it when I discovered that if I tried to connect to the Beanstalk environment URL via https, Chrome warned me that the certificate being used was set up for the domain served by the Cloudfront distribution that was causing trouble.
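As a rough illustration of the setting that was changed: in the CloudFront API, the origin protocol policy lives under the origin's `CustomOriginConfig` as `OriginProtocolPolicy`, with the values `http-only`, `match-viewer` and `https-only`. The sketch below only manipulates a plain dictionary shaped like such an origin entry; the `Id` value and the helper name `with_https_only` are hypothetical, and no AWS call is made.

```python
import copy


def with_https_only(origin):
    """Return a copy of an origin entry with the protocol policy set to HTTPS only."""
    updated = copy.deepcopy(origin)
    updated["CustomOriginConfig"]["OriginProtocolPolicy"] = "https-only"
    return updated


# Hypothetical origin entry resembling the setup described in the question.
origin = {
    "Id": "beanstalk-origin",
    "DomainName": "the-beanstalk-env-url.com",
    "CustomOriginConfig": {
        "HTTPPort": 80,
        "HTTPSPort": 443,
        "OriginProtocolPolicy": "http-only",
    },
}

fixed_origin = with_https_only(origin)
```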

Azure CDN "loses" requests

We're using Azure CDN (Verizon Standard) to serve images to ecommerce sites. However, we're seeing an unreasonable number of loads from the origin: images that should have been cached in the CDN are requested again multiple times.
Images seem to stay in the cache if they're requested very frequently (a Pingdom page speed test that runs every 30 minutes doesn't show the problem).
Additionally, if I request an image (using the browser), the scaled image is requested from the origin and delivered, but the second request doesn't return a cached file from the CDN; the origin is called again. The third request is served from the CDN.
The origin is a web app which scales and delivers the requested images. All requests for images have the following headers which might affect caching:
cache-control: max-age=31536000, s-maxage=31536000
ETag: e7bac8d5-3433-4ce3-9b09-49412ac43c12?cache=Always&maxheight=3200&maxwidth=3200&width=406&quality=85
Since we want the CDN to cache the scaled image, Azure CDN Endpoint is configured to cache every unique url and the caching behaviour is "Set if missing" (although all responses have the headers above).
Using the same origin with AWS Cloudfront works perfectly (but since we have everything else in Azure, it would be nice to make it work). I haven't been able to find if there's any limit or constraints for the ETag but since it works with AWS it seems like I'm missing something related to either Azure or Verizon.

Does Varnish work with https after enabling cloudflare?

I installed Varnish 5.2 on my VPS, and I'm using Cloudflare and https.
I added a rule to cache html pages for 1 hour, then I tested the cache.
All pages were cached and expired after 1 hour, and everything works fine.
It's known that Varnish doesn't work with https, so did I miss something?
I'm using WordPress. The site is a very simple blog, and only admins log in to the site.
You haven't provided many details but you're probably talking about Cloudflare's service Flexible SSL. The traffic from client to CF is encrypted but the traffic from CF to your Varnish server goes over HTTP with an X-Forwarded-Proto header to inform the backend server that the original request was over HTTPS and to render the website as such.
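A minimal sketch of how a backend might read that header is shown below. The function name `original_scheme` is hypothetical, the example is in Python rather than WordPress's PHP, and it assumes the header can be trusted because requests only reach the backend through the proxy.

```python
def original_scheme(headers, default="http"):
    """Return the scheme the client originally used, based on X-Forwarded-Proto."""
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("x-forwarded-proto", "")
    # Chained proxies may send a list ("https, http"); take the first entry.
    first = value.split(",")[0].strip().lower()
    return first if first in ("http", "https") else default
```

A backend can then generate https links and skip redirecting to https when `original_scheme(request_headers)` returns `"https"`, even though the connection from the proxy itself was plain HTTP.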

Can I write redirect rules in Amazon CloudFront?

I am using the Apache web server's mod_rewrite module to redirect from the old domain to the new domain. Example:
RewriteRule /page1.html /page2.html [L]
I am using Amazon CloudFront to cache these files. Please let me know whether I can write similar rules in Amazon CloudFront so that this traffic never reaches the web server.
Thanks
Siva.
No, you can't, but there's no need to configure the redirects in CloudFront.
Have your origin web server issue the redirects you need, and CloudFront will cache them for you (after all, a redirect is just another HTTP response). Once the redirect has been served and cached, subsequent requests for that resource are answered with the redirect from CloudFront's cache rather than by your web server.
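To make the idea concrete, here is a small sketch of an origin-side redirect that a cache in front of it could store. The mapping reuses the page names from the question; the function name `redirect_response` and the one-day `max_age` default are assumptions chosen for illustration, not CloudFront requirements.

```python
# Old path -> new path, using the example from the question.
REDIRECTS = {"/page1.html": "/page2.html"}


def redirect_response(path, max_age=86400):
    """Return (status, headers) for a cacheable redirect, or None if no rule matches."""
    target = REDIRECTS.get(path)
    if target is None:
        return None
    headers = [
        ("Location", target),
        # Tells caches how long they may reuse this redirect.
        ("Cache-Control", f"public, max-age={max_age}"),
    ]
    return "301 Moved Permanently", headers
```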

Amazon EC2 serves gzipped JavaScript. But Cloudfront does not. Why?

I have an Amazon EC2 Web Server instance which serves gzipped content when the Accept-Encoding header is set to gzip. But when I make the same request with the exact same header to a CloudFront CDN with the origin server as my Amazon EC2 instance, it doesn't send back a gzipped response.
I also tried creating a new CloudFront distribution (because I thought the old distribution might have the uncompressed response cached) and then making the same request, and I still get an uncompressed response.
Can someone please tell me what I may be missing?
This has been marked as a possible duplicate of a question relating to S3. The question is around EC2 - not S3, so I don't think this is a duplicate.
You’re likely seeing this issue due to Cloudfront adding a ‘Via’ header to the requests made to your origin server. It’s a known issue with IIS.
If you were to look at the incoming HTTP requests to your origin, you’d see something like this in your HTTP headers:
Via=1.1 9dc1db658f6cee1429b5ff20764c5b07.cloudfront.net (CloudFront)
X-Amz-Cf-Id=k7rFUA2mss4oJDdT7rA0HyjG_XV__XwBV14juZ8ZAQCrbfOrye438A==
X-Forwarded-For=121.125.239.19, 116.127.54.19
The addition of a ‘Via’ header is standard proxy server behaviour. When IIS sees this, it drops the gzip compression (I’m guessing due to an assumption that older proxy servers couldn’t handle compressed content).
If you make the following changes to your applicationHost.config, you should rectify the issue:
<location path="Your Site">
  <system.webServer>
    <httpCompression noCompressionForHttp10="false" noCompressionForProxies="false" />
  </system.webServer>
</location>
The other issue to watch out for is that IIS doesn’t always compress the response to the first request it receives for a given resource. Cloudfront may therefore make a request to the origin, then receive, cache and serve the uncompressed version of the content to subsequent visitors. Again, you can modify this behaviour using the serverRuntime settings in the applicationHost.config:
<location path="Your Site">
  <system.webServer>
    <httpCompression noCompressionForHttp10="false" noCompressionForProxies="false" />
    <serverRuntime frequentHitThreshold="1" frequentHitTimePeriod="00:00:05" />
  </system.webServer>
</location>
More details on these settings here:
http://www.iis.net/configreference/system.webserver/serverruntime
http://www.iis.net/configreference/system.webserver/httpcompression
Credit to this blog post for explaining the issue:
http://codepolice.net/2012/06/26/problems-with-gzip-when-using-iis-7-5-as-an-origin-server-for-a-cdn/
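One way to check whether the origin compresses responses when a ‘Via’ header is present is to send it a request that resembles what a proxy would send and inspect the Content-Encoding of the reply. The sketch below uses only the Python standard library; the function names and the placeholder ‘Via’ value are assumptions for illustration, and `origin_compresses_for_proxies` performs a real network request against whatever URL you pass it.

```python
import urllib.request


def build_proxy_like_request(url):
    """Build a request carrying the headers a proxy such as Cloudfront would add."""
    return urllib.request.Request(url, headers={
        "Accept-Encoding": "gzip",
        # Placeholder value; a real Cloudfront Via header contains an edge hostname.
        "Via": "1.1 example.cloudfront.net (CloudFront)",
    })


def is_gzipped(response_headers):
    """Return True if the response headers declare gzip content encoding."""
    encoding = response_headers.get("Content-Encoding") or ""
    return "gzip" in encoding.lower()


def origin_compresses_for_proxies(url):
    """Fetch the URL directly from the origin and report whether the reply was gzipped."""
    with urllib.request.urlopen(build_proxy_like_request(url)) as response:
        return is_gzipped(response.headers)
```

Running `origin_compresses_for_proxies` against the origin URL before and after the configuration change gives a quick indication of whether the ‘Via’ header was what suppressed compression.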
