Cache Policy - caching only if request succeeded

I have enabled some cache policies on a few resource endpoints. The system works quite well: the response is cached, subsequent requests hit the cache, and the cache is correctly refreshed when I configure it to be.
My only concern is that sometimes a client makes a request that misses the cache (for example, because the cache entry has to be refreshed), and at that moment the server returns an error (it can happen, statistically), so the cached response is not a "normal" response (e.g. 2xx) but a 4xx or a 5xx response.
I would like to know if it is possible to cache the response only if, for example, the server response code is 2xx.
I didn't find any example in the Apigee docs for doing this, although the cache policy has a parameter called "SkipCachePopulation" that I think I can use for this purpose.
Any suggestions?

Yes, you can use the SkipCachePopulation field of ResponseCache. It uses a condition to determine when the cache population will not occur. Here is an example:
<SkipCachePopulation>response.status.code >= 400</SkipCachePopulation>

RefreshHit from CloudFront even with cache-control: max-age=0, no-store

CloudFront is getting a RefreshHit for a request that is not supposed to be cached at all.
It shouldn't be cached because:
It has cache-control: max-age=0, no-store;
The Minimum TTL is 0; and
I've created multiple invalidations (on /*) so this cached resource isn't from some historical deploy
Any idea why I'm getting RefreshHits?
I also tried changing Cache-Control to cache-control: no-store, stale-if-error=0 and creating a new invalidation on /*, and now I'm seeing a cache hit (this time in Firefox).
After talking extensively with support, they explained what's going on.
So, if you have no-store and a Minimum TTL of 0, then CloudFront will indeed not store your resources. However, if your origin is taking a long time to respond (likely because it's under heavy load), and CloudFront receives another identical request (identical with respect to the cache key) while it is still waiting for the response, it will send the single response to both requests. This is done to lighten the load on the server. (see docs)
Support was calling these "collapse hits" although I don't see that in the docs.
So, it seems you can't have a single Behavior serving some pages that must have a unique response per request while serving other pages that are cached. Support said:
I just confirmed that, with min TTL 0 and cache-control: no-store, we cannot disable collapse hit. If you do need to fully disable cloudfront cache, you can use cache policy CachingDisabled
We'll be making a behavior for every path prefix that we need caching on. It seems there was no better way than this for our use-case (transitioning our website one page at a time from a non-cacheable, backend-rendered jinja2/jQuery to a cacheable, client-side rendered React/Next.js).
It's probably too late for OP's project, but I would personally handle this with a simple origin-response Lambda@Edge function, a single cache behavior for /*, and a single cache policy. You can write all of the filtering/caching logic in the origin-response function. That way you only manage one bit of function code in one place, instead of a bunch of individual cache behaviors (and possibly a bunch of cache policies).
For example, an origin-response function can look for a cache-control response header coming from your origin. If it exists, pass it back to the client. If it doesn't exist (or if you want to overwrite it with something else), you can create the response header there. The edge doesn't care whether the cache-control header came from your origin or from an origin-response Lambda; to the edge, it is all the same.
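A minimal sketch of such an origin-response function in Python, assuming the standard Lambda@Edge event shape; the fallback Cache-Control value is just an illustrative placeholder:

def handler(event, context):
    # Lambda@Edge origin-response events carry the origin's response here.
    response = event['Records'][0]['cf']['response']
    headers = response['headers']

    # Header names are lower-cased in the CloudFront event structure.
    if 'cache-control' not in headers:
        # The origin sent no Cache-Control; set one here (illustrative value).
        headers['cache-control'] = [
            {'key': 'Cache-Control', 'value': 'public, max-age=300'}
        ]

    return response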
Another trick you can use to avoid caching while still using the default CloudFront behavior is to add a dummy, unused query parameter that takes a unique value for each request.
Python example:
import requests
import uuid

# Each call uses a unique query-string value, so the cache key never repeats.
requests.get(f'http://my-test-server-x.com/my/path?nocache={uuid.uuid4()}')
requests.get(f'http://my-test-server-x.com/my/path?nocache={uuid.uuid4()}')
Note that both calls will reach the destination and will not be served from the cache, since uuid.uuid4() always generates a unique value.
This works because, by default (unless configured otherwise in the Behavior section), query parameters are part of the cache key.
Note: doing so bypasses the cache entirely, so your backend may end up absorbing every request.

Throttle HTTP Request based on Available Memory

I have a REST API that is expected to receive a large payload as the request body. The API calls a blocking method that takes 2 seconds to process each request and then returns 200 OK. I wish to introduce throttling based on available memory, such that the API returns 429 Too Many Requests when available memory falls below a threshold.
When the threshold condition is met, I wish to reject subsequent requests right away, even before loading the large request payloads into application memory. This would also give me some protection against denial-of-service attacks.
In a Java EE/Tomcat environment, if I use a Filter to check available memory, I understand the complete request has already been loaded into memory. Is it then better to add the check in the ServletRequestListener.requestInitialized method, so that I can reject the request even before the app receives it?
P.S. I use the below formula to calculate available memory based on this SO post:
long presumableFreeMemory =
    Runtime.getRuntime().maxMemory()      // upper bound the heap may grow to (-Xmx)
    - Runtime.getRuntime().totalMemory()  // memory currently claimed by the JVM
    + Runtime.getRuntime().freeMemory();  // claimed-but-unused memory within the heap
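For illustration, the early-rejection idea can be sketched outside Java EE as well; here it is as a Python WSGI middleware (the threshold and the psutil call are assumptions for the sketch, not part of the setup above). The key point is returning 429 before the request body is ever read:

import psutil

THRESHOLD_BYTES = 200 * 1024 * 1024  # assumed threshold: 200 MB

def throttle_middleware(app):
    def middleware(environ, start_response):
        if psutil.virtual_memory().available < THRESHOLD_BYTES:
            # Reject immediately, without touching environ['wsgi.input'],
            # so the large payload is never loaded into application memory.
            start_response('429 Too Many Requests', [('Retry-After', '1')])
            return [b'']
        return app(environ, start_response)
    return middleware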

HTTP GET vs POST for Idempotent Reporting

I'm building a web-based reporting tool that queries but does not change large amounts of data.
In order to verify the reporting query, I am using a form for input validation.
I know the following about HTTP GET:
It should be used for idempotent requests
Repeated requests may be cached by the browser
What about the following situations?
The data being reported changes every minute and must not be cached?
The query string is very large and greater than the 2000 character URL limit?
I know I can easily just use POST and "break the rules", but are there definitive situations in which POST is recommended for idempotent requests?
Also, I'm submitting the form via AJAX and the framework is Python/Django, but I don't think that should change anything.
I think that using POST for this sort of situation is acceptable. Citing the HTTP 1.1 RFC:
The action performed by the POST method might not result in a
resource that can be identified by a URI. In this case, either 200
(OK) or 204 (No Content) is the appropriate response status,
depending on whether or not the response includes an entity that
describes the result.
In your case, a "search result" resource is created on the server, which adheres to the HTTP POST specification. You can either return the result resource directly in the response, or return a URI to the newly created resource; that resource can then be deleted once it is no longer needed, after one minute's time (since, as you said, the data changes every minute).
The data being reported changes every minute
Every time you make a request, a new resource is going to be created, according to your statement above.
Additionally, you could return a 201 status with a URL for retrieving the search result resource. I'm not sure you want that sort of behavior; I provide it only as a side note.
The second part of your first question says the results must not be cached. That is something you configure on the server: return the appropriate HTTP headers (Cache-Control, Expires, etc.) to force intermediary proxies and clients not to cache the result.
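Since you mention Django, a minimal sketch of such a never-cached POST endpoint might look like this (the view name and run_report_query are hypothetical, not a prescribed implementation):

from django.http import JsonResponse
from django.views.decorators.cache import never_cache
from django.views.decorators.http import require_POST

@never_cache   # adds Cache-Control headers that forbid caching of the response
@require_POST  # the large query arrives in the POST body, not the URL
def report(request):
    params = request.POST                # assumed: validated form fields
    results = run_report_query(params)   # hypothetical reporting query
    return JsonResponse({'results': results})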
As for your second question: it answers itself, since you have to use a POST request instead of a GET request due to the URL character limit.

What is it called when two requests are being served from the same cache?

I'm trying to find the technical term for the following (and potential solutions), in a distributed system with a shared cache:
request A comes in, cache miss, so we begin to generate the response for A
request B comes in with the same cache key; since A is not completed yet and hasn't written its result to the cache, B is also a cache miss and begins to generate a response as well
request A completes and stores its value in the cache
request B completes and stores its value in the cache (overwriting request A's cache value)
You can see how this can be a problem at scale, if instead of two requests, you have many that all get a cache miss and attempt to generate a cache value as soon as the cache entry expires. Ideally, there would be a way for request B to know that request A is generating a value for the cache, and wait until that is complete and use that value.
I'd like to know the technical term for this phenomenon; it's a cache race of sorts.
It's a kind of thundering herd; in caching contexts it's also commonly called a cache stampede or dog-piling.
Solution: when the first request A arrives, set a flag; if request B arrives and finds the flag set, it waits. After A has loaded the data into the cache, remove the flag (see the sketch below).
If all the waiting requests are woken up at once by the cache-loaded event, the wake-up itself triggers a thundering herd of threads, so the solution needs care on that point too.
For example, in the Linux kernel only one process is woken up, even when several processes are waiting on the same event.
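A minimal single-process sketch of that flag idea, assuming a threaded Python service with an in-memory dict as the cache (in a distributed system the per-key lock would have to live in shared infrastructure, e.g. a Redis SET ... NX key, rather than threading.Lock):

import threading

cache = {}
registry_lock = threading.Lock()  # guards key_locks
key_locks = {}                    # one lock per cache key, acting as the "flag"

def get_or_compute(key, compute):
    if key in cache:                   # fast path: plain cache hit
        return cache[key]
    with registry_lock:
        lock = key_locks.setdefault(key, threading.Lock())
    with lock:                         # only one caller computes per key
        if key not in cache:           # re-check: the winner may have filled it
            cache[key] = compute()
        return cache[key]              # waiters wake here and reuse the value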

How does adding a random number to the end of an AJAX server request prevent caching?

How exactly does adding a random number to the end of an AJAX server call prevent the database server or browser (not entirely sure which one is intended) from caching? Why does this work?
It is intended to prevent client-side (or reverse proxy) caching.
Since the cache will be keyed on the exact request, by adding a random element to the request, the exact request URL should never be seen twice; so it won't be used more than once, and an intelligent cache won't bother keeping around something that's never been seen more than once, at least, not for long.
It's to prevent your browser (and, to a reasonable extent, web proxies) from caching requests. Because the random query parameter (e.g. ?rand2024=) makes each URL unique, the request never matches a previously cached entry, so the cache is bypassed; your application should simply ignore the extra parameter.
Your browser caches the web page keyed by the exact text of the URL, so adding a random-number parameter ensures that the URL is different every time - thus no real caching. Your browser doesn't know that the server is (hopefully) ignoring this parameter.
