Cached content expiring too frequently on fastly CDN - heroku

I'm using Fastly as a CDN in front of my Heroku application, and am seeing many requests that I expect to be cached make it through.
An example of this behavior is two requests to the URL:
https://nuu-acceptance-herokuapp-com.global.ssl.fastly.net/attachments/f092ff0398b3bace19fae21b17a22320c3da5428/store/fit/240/160/28515a2fa2e47b59f13b2044ea5b9a7c8c9587ceca7d7dfadb28f08730f7/file.jpg. Here are two responses from the requests, which occurred fifteen minutes apart:
RESPONSE 1:
----------
Strict-Transport-Security: max-age=31536000
Content-Encoding: gzip
X-Content-Type-Options: nosniff
Age: 0
Transfer-Encoding: chunked
X-Cache: MISS
X-Cache-Hits: 0
Content-Disposition: inline; filename="file.jpg"
Connection: keep-alive
Via: 1.1 vegur
Via: 1.1 varnish
X-Request-Id: bc766069-c2ca-4a66-ba88-a8d76da72e2d
X-Served-By: cache-sjc3124-SJC
X-Runtime: 3.711698
Last-Modified: Tue, 23 Jun 2015 18:44:27 GMT
Server: Cowboy
X-Timer: S1435085062.909546,VS0,VE4437
Date: Tue, 23 Jun 2015 18:44:27 GMT
Vary: Accept-Encoding
Content-Type: image/jpeg
Access-Control-Allow-Origin: *
Cache-Control: public, must-revalidate, max-age=31536000
Set-Cookie: __profilin=; path=/; max-age=0; expires=Thu, 01 Jan 1970 00:00:00 -0000; secure
Accept-Ranges: bytes
Expires: Wed, 22 Jun 2016 18:44:27 GMT
----------
RESPONSE 2:
----------
Strict-Transport-Security: max-age=31536000
Content-Encoding: gzip
X-Content-Type-Options: nosniff
Age: 0
Transfer-Encoding: chunked
X-Cache: MISS
X-Cache-Hits: 0
Content-Disposition: inline; filename="file.jpg"
Connection: keep-alive
Via: 1.1 vegur
Via: 1.1 varnish
X-Request-Id: 60ee54b0-9509-42c5-9b03-c0f5854c5524
X-Served-By: cache-sjc3135-SJC
X-Runtime: 0.251021
Last-Modified: Tue, 23 Jun 2015 18:57:44 GMT
Server: Cowboy
X-Timer: S1435085863.749442,VS0,VE560
Date: Tue, 23 Jun 2015 18:57:44 GMT
Vary: Accept-Encoding
Content-Type: image/jpeg
Access-Control-Allow-Origin: *
Cache-Control: public, must-revalidate, max-age=31536000
Accept-Ranges: bytes
Expires: Wed, 22 Jun 2016 18:57:44 GMT
Both are cache misses, even though I expect this content to be cached for a year. It also appears that the same Fastly cluster handled the request. Can anyone point me to what I might be doing wrong? I'm seeing this behavior across many files served by Fastly - fastly seems to serve the files intermittently, but there are cache misses much more often than I expect.
I'd appreciate any help that anyone could give me with this - thanks!

If you look at the HTTP headers from your responses, you will see that they both contain a Set-Cookie header. Responses with cookies will not be cached by Fastly. You can remove them, however, in your app or within your Fastly configuration.

Related

Character Encoding Issue when activating Cloudflare

When I activate cloudflare we have an encoding or caching issue whereby special characters appear all over the page.
Header response with cloudflare deactivated:
Alt-Svc: quic=":443"; ma=2592000; v="35,39,43,44"
Cache-Control: no-cache, must-revalidate
Connection: close
Content-Encoding: gzip
Content-Length: 8156
Content-Type: text/html; charset=UTF-8
Date: Wed, 14 Aug 2019 14:19:31 GMT
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Wed, 14 Aug 2019 14:19:31 GMT
Pragma: no-cache
Server: LiteSpeed
Set-Cookie: pmd_template=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/New/; domain=www.eastlondonbusinessdirectory.co.za
Set-Cookie: pmd_template=listimia; expires=Fri, 13-Sep-2019 14:19:31 GMT; Max-Age=2592000; path=/New/; domain=www.eastlondonbusinessdirectory.co.za
Vary: Accept-Encoding,User-Agent
X-Powered-By: PHP/7.0.33
Header response with cloudflare activated:
Cache-Control: no-cache, must-revalidate
CF-RAY: 50639e64dd188074-CPT
Connection: keep-alive
Content-Encoding: zlib,gzip,deflate
Content-Type: text/html; charset=UTF-8
Date: Wed, 14 Aug 2019 14:29:03 GMT
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Wed, 14 Aug 2019 14:29:02 GMT
Pragma: no-cache
Server: cloudflare
Set-Cookie: pmd_template=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/New/; domain=www.eastlondonbusinessdirectory.co.za
Set-Cookie: pmd_template=listimia; expires=Fri, 13-Sep-2019 14:29:02 GMT; Max-Age=2592000; path=/New/; domain=www.eastlondonbusinessdirectory.co.za
Transfer-Encoding: chunked
Vary: User-Agent
X-Powered-By: PHP/7.0.33
X-Turbo-Charged-By: LiteSpeed
I believe I might need to make sure my origin server sends a header that tells the cache to serve pages based on the content encoding, but I'm not too sure if this logic is correct because with Cloudflare activated I only see Vary: User Agent? which is in fact ignored by Cloudflare... If the logic is correct then I'm not too sure how to go about fixing this. I have tried to add a page rule from Cloudflare to cache everything and also added the following in .htaccess
</IfModule>
AddDefaultCharset UTF-8
<IfModule mod_headers.c>
<FilesMatch ".(js|css|xml|gz|html)$">
Header append Vary: Accept-Encoding
</FilesMatch>
</IfModule>
but both does not work.
Any help will be much appreciated to get this issue resolved and will accept the answer
Please help. Thank you
Cloudflare currently only respects the Accept-Encoding vary header.
If you want to vary based on other factors, you can either consider:
Custom caching set up for Enterprise Plan
"Bypass Cache" entirely using Page Rules
Serve different content types from distinct URLs to continue leveraging caching
Workaround using Cloudflare Workers

webhdfs rest api throwing file not found exception

I am trying to open a hdfs file that is present on cdh4 cluster from cdh5 machine using webhdfs from the command line as below:
curl -i -L "http://namenodeIpofCDH4:50070/webhdfs/v1/user/quad/source/JSONML.java?user.name=quad&op=OPEN"
I am getting "File Not Found Exception" even if the file JSONML.java is present in the mentioned path in namenode as well as datanode and its trace is as follows:
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Thu, 01-Jan-1970 00:00:00 GMT
Date: Mon, 22 Feb 2016 13:25:35 GMT
Pragma: no-cache
Date: Mon, 22 Feb 2016 13:25:35 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=quad&p=quad&t=simple&e=1456183535737&s=KdZYcA5iwJeIU2F9ZJfLSaT4qMY=";Path=/
Location: http://n3.quadratics.com:50075/webhdfs/v1/user/quad/source/JSONML.java?op=OPEN&user.name=quad&namenoderpcaddress=n1.quadratics.com:8020&offset=0
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26.cloudera.4)
HTTP/1.1 404 Not Found
Cache-Control: no-cache
Expires: Mon, 22 Feb 2016 13:26:28 GMT
Date: Mon, 22 Feb 2016 13:26:28 GMT
Pragma: no-cache
Expires: Mon, 22 Feb 2016 13:26:28 GMT
Date: Mon, 22 Feb 2016 13:26:28 GMT
Pragma: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26.cloudera.4)
{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File does not exist: /user/quad/source/JSONML.java\n\tat org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:559)\n\tat org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:415)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)\n"}}
But I don't get any error and get the status of the above file when I use the below command:
curl -i -L http://namenodeIpofCDH4:50070/webhdfs/v1/user/quad/source/JSONML.java?user.name=quad&op=GETFILESTATUS"
I get the output response as below:
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Thu, 01-Jan-1970 00:00:00 GMT
Date: Mon, 22 Feb 2016 13:38:48 GMT
Pragma: no-cache
Date: Mon, 22 Feb 2016 13:38:48 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=quad&p=quad&t=simple&e=1456184328134&s=sE6esO8J39O+itl+ggNzX4/WzjQ=";Path=/
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26.cloudera.4)
{"FileStatus":{"accessTime":1456147448567,"blockSize":134217728,"group":"quad","length":14849,"modificationTime":1456143798039,"owner":"quad","pathSuffix":"","permission":"644","replication":3,"type":"FILE"}}
Any ideas of the reason of why opening a file is failing and fixing that would be greatly appreciated.
I saw a similar error when I had misconfigured my /etc/hosts
The OPEN command above returns a redirect which provides a hostname. The localhost will try and resolve this hostname based on the local DNS setup.
It will then look for the file at the IP address that the hostname resolves to. Not necessarily the one you issued the command to.

How to cache API calls via Varnish?

My goal is to cache RESTful API calls via Varnish. AS I was reading on stackoverflow and other resources, Varnish can not cache post requests. This is exactly what I am experiencing. Therefore I moved to get with ?id=30 but then I realized that those also do not get cached because of the question mark.
So the question is, how o cache API-Calls over Varnish?
Here are two example calls to my API, secured by Oauth2 with 2 parameters pased by post:
curl --insecure -s -k https://test-api/v1/test -d 'access_token=72f50e68a0aed7921c6cb058de8e7e6ed4ebd692&clid=585970' -D- -o/dev/null
HTTP/1.1 200 OK
Date: Sat, 29 Aug 2015 13:02:36 GMT
Server: Apache/2.2.31 (Unix) PHP/5.6.11
X-Powered-By: PHP/5.6.11
Vary: Accept-Encoding
Content-Length: 2121
Content-Type: application/json
X-Varnish: 6558010
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Set-Cookie: SERVERID=S3; path=/
curl --insecure -s -k https://test-api/v1/test -d 'access_token=72f50e68a0aed7921c6cb058de8e7e6ed4ebd692&clid=585970' -D- -o/dev/null
HTTP/1.1 200 OK
Date: Sat, 29 Aug 2015 13:02:56 GMT
Server: Apache/2.2.31 (Unix) PHP/5.6.11
X-Powered-By: PHP/5.6.11
Vary: Accept-Encoding
Content-Length: 2121
Content-Type: application/json
X-Varnish: 12814168
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Set-Cookie: SERVERID=S2; path=/
Is it possible to configure Varnish to cache the API calls? POST/GET either way I don't mind.

Setting Access-Control-Allow-Origin on Cloudfront

I am having problems serving static assets to Firefox using AWS Cloudfront.
Chrome works perfect, but Firefox is returning a CORS error.
If I execute curl , I get:
HTTP/1.1 200 OK
Content-Type: application/x-font-opentype
Content-Length: 39420
Connection: keep-alive
Date: Mon, 11 Aug 2014 21:53:50 GMT
Cache-Control: public, max-age=31557600
Expires: Sun, 09 Aug 2015 01:28:02 GMT
Last-Modified: Fri, 08 Aug 2014 19:28:05 GMT
ETag: "9df744bdf9372cf4cff87bb3e2d68fc8"
Accept-Ranges: bytes
Server: AmazonS3
Age: 2743
X-Cache: Hit from cloudfront
Via: 1.1 c445b20dfbf3128d810e975e5d84e2cd.cloudfront.net (CloudFront)
X-Amz-Cf-Id: ...
Which I think needs the header:
Access-Control-Allow-Origin: *
Can anyone help me? Why is it a problem on Firefox and not Chrome? How can I solve it?
Have you configured your distribution to support CORS by setting Origin header to be forwarded?
Reference:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/header-caching.html#header-caching-web-cors

Drive Realtime API no longer returns realtime document on localhost

I have been calling the following gapi javascript function with great success for a few months:
gapi.drive.realtime.load(fileId,
successHandler,
initializer,
errorHandler);
Suddenly, at 1:30 PM CDT today, that call stopped working when run in javascript on localhost. I can deploy the exact same code to my server and it works perfectly!
Frustratingly, none of the callbacks are called - not successHandler OR errorHandler.
I have localhost:3000 set as an allowed javascript origin in my Google API Console project, and anyway I haven't changed any settings there since this was working. I am correctly authorized and can make REST calls to the Drive API without an issue.
Has anyone else seen this behavior suddenly? Can anyone from the Google team make a suggestion?
Update: the request inspector shows a GET to
https://drive.google.com/otservice/gs?access_token=[ommitted-for-stackoverflow]&id=[also-omitted]
with the response
)]}'
["17AKDsTY8kHESKfQavrHeh3YybD5k4b6ty8CQ78MHtyc","724b79b808d48070",false,1,[1,""],[0,[28,"724b79b808d48070","110581799581534438628",false,true,"REL DEV","#58B442","https://lh3.googleusercontent.com/-XdUIqdMkCWA/AAAAAAAAAAI/AAAAAAAAAAA/4252rscbv5M/s128/photo.jpg"]]]
The headers are
HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
access-control-allow-origin: *
access-control-expose-headers: Content-Length,Content-Type,X-Restart
alternate-protocol: 443:quic
cache-control: no-cache, no-store, max-age=0, must-revalidate
content-disposition: attachment; filename="json.txt"; filename*=UTF-8''json.txt
content-encoding: gzip
content-type: application/json; charset=utf-8
date: Fri, 04 Apr 2014 22:15:40 GMT
expires: Fri, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
server: GSE
vary: Origin
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-restart:
x-xss-protection: 1; mode=block
There are no other requests after that.

Resources