How to invalidate client-cache? - caching

If an application has client-side caching and data changes on the server side, how does the client come to know about it so that it can invalidate its cache?

If the server sends "Cache-Control: max-age=100" in the response headers of the first request for the data, the client saves the response in its local cache store.
If the client sends the same request within those 100 seconds, the response is served from the local cache store and no request goes to the server.
Once 100 seconds have passed, the locally cached data is considered stale, so the same request goes to the server again, this time as a conditional request carrying a validator such as If-None-Match or If-Modified-Since. If the server determines that the resource has not been modified, it does nothing and returns a response with status 304 (Not Modified) and no body.
Seeing this status, the client renews the validity period of the expired data, and all requests sent within the next 100 seconds are served from the cache again.
This flow is the client-side cache invalidation mechanism.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
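The max-age / 304 flow above can be sketched as a small decision function. This is an illustrative sketch, not a real browser API (the browser implements this logic for you); all names here are made up:

```javascript
// Client-side freshness check: an entry cached with max-age=100 is
// served locally while fresh; once stale, the client revalidates with
// a conditional request, and a 304 renews the freshness window.

function cacheDecision(entry, nowSeconds) {
  // entry: { storedAt, maxAge, etag } as saved from the first response
  const age = nowSeconds - entry.storedAt;
  if (age < entry.maxAge) {
    return { action: "serve-from-cache" }; // still fresh, no request sent
  }
  // Stale: send a conditional request using the saved validator.
  return { action: "revalidate", headers: { "If-None-Match": entry.etag } };
}

// On a 304 the client keeps the stored body and restarts the window;
// on a 200 the entry must be replaced with the new response.
function applyRevalidation(entry, status, nowSeconds) {
  if (status === 304) {
    return { ...entry, storedAt: nowSeconds }; // renew validity period
  }
  return null; // caller stores the fresh 200 response instead
}
```

So within the freshness window nothing leaves the machine, and after it only headers travel until the resource actually changes.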

Related

Expected caching behavior when 2 hits to same path on CloudFront

So let's say I have a CloudFront distribution and I call /path1, then /path1 again soon after. The flow will be:
1st request is a CloudFront miss and goes to server
2nd request is a CloudFront hit
But what if there are two parallel hits to CloudFront at the same time, such that the 2nd request reaches CloudFront before the 1st request finishes and CloudFront does not yet have the cached response? Will it wait for the 1st request to finish and return that response, or will both requests hit the server?
Looking at my NGINX logs, it seems that when parallel calls arrive, they all reach the server. Is there any way to avoid that? If a rogue client makes too many requests for the same path, I was hoping that CloudFront could make only one request to the server and return that same response to all of them. Is that possible?
If the first request is still being served to the client, CloudFront hasn't cached the response yet. Once the first request is complete, meaning CloudFront has received the complete data from the origin and served it to the client, it keeps the response in its cache and serves it for further requests.
If you make a second request before the first request has completed, it will be a MISS from CloudFront and you'll see both requests reach the origin.
However, if you're talking about a burst of parallel requests, CloudFront has a way to handle them:
Simultaneous Requests for the Same Object
"When a CloudFront edge location receives a request for an object and either the object isn't currently in the cache or the object has expired, CloudFront immediately sends the request to your origin. If there's a traffic spike—if additional requests for the same object arrive at the edge location before your origin responds to the first request—CloudFront pauses briefly before forwarding additional requests for the object to your origin. Typically, the response to the first request will arrive at the CloudFront edge location before the response to subsequent requests. This brief pause helps to reduce unnecessary load on your origin server. If additional requests are not identical because, for example, you configured CloudFront to cache based on request headers or cookies, CloudFront forwards all of the unique requests to your origin."
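What the quoted docs describe is essentially request collapsing: concurrent requests for the same object share a single origin fetch. If you wanted similar protection at the application layer (for example, in a Node service in front of the origin), a single-flight wrapper is one way to sketch it; the names below are made up for illustration:

```javascript
// Single-flight wrapper: concurrent callers asking for the same key
// share one in-flight promise, so the origin is hit only once per key
// at a time. `fetchOrigin` stands in for the real origin call.

function makeSingleFlight(fetchOrigin) {
  const inFlight = new Map();
  return function get(key) {
    if (inFlight.has(key)) {
      return inFlight.get(key); // join the request already in flight
    }
    const p = Promise.resolve()
      .then(() => fetchOrigin(key))
      .finally(() => inFlight.delete(key)); // allow a fresh fetch later
    inFlight.set(key, p);
    return p;
  };
}
```

With this, two parallel get("/path1") calls resolve to the same response while producing a single origin hit, which is the effect CloudFront's "brief pause" aims for.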

HTTP Cache Validation

I read Http spec. but I have a doubt and I hope someone can help me.
When a cache receives a request and has a stored response that must be validated before being served, does the cache forward the received request (adding the conditional header fields it needs for validation) to the next server, or does it generate a new request (with the conditional header fields it needs for validation) and send that generated request to the next server?
Thank you very much! :)
I think the idea is that the client would issue the request with the key headers, and the server would either respond with the content or a 304 to use whatever was in the local cache.
This behavior should be the same for upstream caches along the network path all the way to the source of truth.
"When a cache receives a request..."
A cache doesn't receive HTTP requests. It is the user agent (browser) that checks the cache to see whether there is a cache entry matching an HTTP request. The cache itself is just a bunch of data stored on disk or in memory.
"Does the cache send the received request...OR does the cache generate a new request..."
A cache doesn't send HTTP requests. It is the user agent's (browser's) job to send the request.
In summary, a cache is just bytes of data; it doesn't know when or where an HTTP request is sent. All cache validation logic (the cache-related HTTP headers) is implemented by the user agent.
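To make that model concrete: the user agent looks up its cache entry and attaches the validators itself before sending. A minimal sketch with illustrative names only:

```javascript
// The user agent, not the cache, adds conditional headers: it copies
// validators from the stored cache entry onto the outgoing request.

function addValidators(request, cacheEntry) {
  const headers = { ...request.headers };
  if (cacheEntry && cacheEntry.etag) {
    headers["If-None-Match"] = cacheEntry.etag;
  }
  if (cacheEntry && cacheEntry.lastModified) {
    headers["If-Modified-Since"] = cacheEntry.lastModified;
  }
  return { ...request, headers }; // same request, now conditional
}
```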

Long polling using rxjs

I want to create long polling client for a web service using RxJS.
The targeted endpoint supports blocking requests and sends an x-index header to the client with a value representing the current state of the endpoint. This x-index value is sent in subsequent requests as a query parameter, so the endpoint responds only when x-index changes or when the request times out.
1. --> client sends the first request to the server
2. <-- server immediately responds with an x-index header
3. --> client sends a blocking request with the value of x-index as a parameter
4. <-- the request is suspended until the state changes or until timeout, then the server sends the response
5. if x-index has changed, pass the data to the subscriber and repeat from step 3
I don't know how to create that loop of server requests with the changing x-index parameter. Can anybody help, please?
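The loop in steps 3-5 can be sketched as a plain async loop; in RxJS the same recursion is typically expressed with the expand operator. A sketch assuming a hypothetical fetchState(index) helper that performs the HTTP call and resolves to { index, data }:

```javascript
// Long-polling loop: each response's x-index feeds the next request.
// `fetchState` stands in for the real HTTP call, `onData` is the
// subscriber callback, and `shouldStop` lets the caller end the loop.

async function longPoll(fetchState, onData, shouldStop) {
  // Steps 1-2: initial request just to learn the current x-index.
  let { index } = await fetchState(undefined);
  // Steps 3-5: blocking requests in a loop, repeating with the new index.
  while (!shouldStop()) {
    const res = await fetchState(index);
    if (res.index !== index) { // state changed (a timeout keeps the index)
      index = res.index;
      onData(res.data);
    }
  }
}
```

With RxJS, the equivalent shape would be something like defer(() => fetchState(index)) recursed through expand, emitting to subscribers only when the index changes, but the state-threading logic is the same as above.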

How does validation work in case of a browser cache, proxy cache and an origin server?

See: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching#Freshness
when the cache receives a request for a stale resource, it forwards this request with an If-None-Match header to check whether it is in fact still fresh. If so, the server returns a 304 (Not Modified) response without sending the body of the requested resource, saving some bandwidth.
Let's assume we have: a browser cache, proxy cache and an origin server:
The browser cache contains a stored stale resource with entity-tag "A".
The proxy cache contains a stored stale resource with entity-tag "B". The proxy cache can act as a client, and as a server.
This can for example be the case if you're just starting to use a proxy cache. What will happen in this case?
The browser will send a conditional request with If-None-Match: "A".
The proxy cache receives the conditional request.
The proxy cache will forward this request (according to the quote above). This is because the stored resource in proxy cache is stale.
The origin server receives the request with the entity-tag "A".
Let's say, the resource on the origin server contains entity-tag "A". Now the server will respond with a 304 Not Modified response.
At this point I don't understand things anymore, so maybe I misunderstood something before? The 304 response is fine for the browser cache, because it holds the same resource as the origin server (same entity-tag). However, the proxy cache holds an older resource (with a different ETag). If the proxy cache were to receive the 304 response (and update its metadata), it would make an old resource valid again.
This is not desirable, so I probably made a mistake somewhere? How does it actually work? How should I think about this process?
Have a look at section 4.3 of RFC 7234. Section 4.3.2 in particular says the following:
When a cache decides to revalidate its own stored responses for a
request that contains an If-None-Match list of entity-tags, the cache
MAY combine the received list with a list of entity-tags from its own
stored set of responses (fresh or stale) and send the union of the
two lists as a replacement If-None-Match header field value in the
forwarded request. If a stored response contains only partial
content, the cache MUST NOT include its entity-tag in the union
unless the request is for a range that would be fully satisfied by
that partial stored response. If the response to the forwarded
request is 304 (Not Modified) and has an ETag header field value with
an entity-tag that is not in the client's list, the cache MUST
generate a 200 (OK) response for the client by reusing its
corresponding stored response, as updated by the 304 response
metadata (Section 4.3.4).
So the proxy can send both entity tags (A and B) to the origin server for validation. If the resource representation hasn't changed, the origin server will send a 304 response. If the entity tag in that response is B, the proxy can freshen its stale, stored response and use it to send a 200 OK response to the client. Upon receiving this new response, the browser can update its cache with it.
Now, in the scenario you have specified, the 304 Not Modified response contains the entity tag A (can such a scenario even occur, given that you are accessing the resource through the proxy?). The spec doesn't seem to address this specific case explicitly, but I guess you can just forward the 304 Not Modified response to the browser. Upon receiving it, the browser can freshen its stale response using the 304's metadata.
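The §4.3.2 behavior can be sketched as two steps: build the union If-None-Match list, then decide what to return to the client when the 304 comes back. The helper names below are hypothetical:

```javascript
// RFC 7234 §4.3.2 sketch: the proxy combines the client's validators
// with its own stored entity-tags, forwards the union, and on a 304
// either relays the 304 (the matching etag was in the client's list)
// or answers 200 from its own freshened stored response.

function unionIfNoneMatch(clientEtags, storedEtags) {
  return [...new Set([...clientEtags, ...storedEtags])];
}

function responseForClient(etagFrom304, clientEtags, storedBodies) {
  if (clientEtags.includes(etagFrom304)) {
    return { status: 304 }; // the client's own copy is still valid
  }
  // The 304 matched only the proxy's copy: serve it as a fresh 200.
  return { status: 200, body: storedBodies[etagFrom304] };
}
```

In the A/B example: the proxy forwards If-None-Match: "A", "B"; if the origin answers 304 with ETag "B", the proxy returns a 200 built from its own stored "B" response, and the browser replaces its "A" entry with it.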

Does Cloudfront keep serving the cached response after receiving an error from the origin on updating the call?

We have set up CloudFront with our own server as its origin and have a JSON call that is cached for 60 seconds (max-age); say we successfully cache the response in CloudFront. Now what happens when CloudFront tries to update the JSON response after 60 seconds by calling our server, and our server responds with an error (or no response at all because it is down)? Does CloudFront keep serving the old response, or return an error?
According to the docs:
If your origin server is unavailable and CloudFront gets a request for an object that is in the edge cache but that has expired (for example, because the period of time specified in the Cache-Control max-age directive has passed), CloudFront continues to serve the expired version of the object. For more information about object expiration
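This "serve stale on origin error" behavior can be sketched at the application layer as well. The helper below is a hypothetical illustration, not CloudFront's actual implementation:

```javascript
// Serve-stale-on-error: try the origin when the cached entry has
// expired; if the origin fails, fall back to the expired cached copy.

async function getWithStaleFallback(cache, key, fetchOrigin, now) {
  const entry = cache.get(key);
  if (entry && now - entry.storedAt < entry.maxAge) {
    return entry.body; // still fresh, serve from cache
  }
  try {
    const fresh = await fetchOrigin(key);
    cache.set(key, { body: fresh.body, maxAge: fresh.maxAge, storedAt: now });
    return fresh.body; // refreshed successfully
  } catch (err) {
    if (entry) return entry.body; // origin down: serve the expired copy
    throw err; // nothing cached at all: propagate the error
  }
}
```

So an error on revalidation only surfaces to clients when there is no expired copy left to fall back on.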
