Disable caching in open-uri - ruby

I have to, sadly, poll an endpoint and update another system when the data changes. I wrote a loop (with a sleep statement so I don’t DOS the server):
require 'nokogiri'
require 'open-uri'
desired_data = 'foo'
data = nil
url = nil
while data != desired_data do
  sleep(2)
  url = "https://elections.wi.gov/index.php/elections-voting/statistics"
  doc = Nokogiri::HTML.parse(open(url))
  puts doc
  # do some nokogiri stuff to extract the information I want.
  # store information to `data` variable.
end
# if control is here it means the data changed
This works fine except when the server updates, open(url) still returns the old content (even if I restart the script).
It seems like there may be some caching at play. How do I disable it?
Here are the HTTP headers returned:
HTTP/2 200
date: Fri, 02 Oct 2020 14:00:44 GMT
content-type: text/html; charset=UTF-8
set-cookie: __cfduid=dd8fca84d468814dd199dfc08d45c98831601647244; expires=Sun, 01-Nov-20 14:00:44 GMT; path=/; domain=.elections.wi.gov; HttpOnly; SameSite=Lax; Secure
x-powered-by: PHP/7.2.24
cache-control: max-age=3600, public
x-drupal-dynamic-cache: MISS
link: <https://elections.wi.gov/index.php/elections-voting/statistics>; rel="canonical"
x-ua-compatible: IE=edge
content-language: en
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
expires: Sun, 19 Nov 1978 05:00:00 GMT
last-modified: Fri, 02 Oct 2020 12:47:38 GMT
vary: Cookie
x-generator: Drupal 8 (https://www.drupal.org)
x-drupal-cache: HIT
x-speed-cache: HIT
x-speed-cache-key: /index.php/elections-voting/statistics
x-nocache: Cache
x-this-proto: https
x-server-name: elections.wi.gov
access-control-allow-origin: *
x-xss-protection: 1; mode=block
cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
cf-cache-status: DYNAMIC
cf-request-id: 058b368b9f00002ff234177200000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 5dbef38c3b6a2ff2-ORD
If it matters, I’m using Ruby 2.7 on macOS Big Sur.

It might be a problem on the Drupal 8 website itself, since it has its own cache manager. If you do see the new content in a web browser, there is probably a per-user cache somewhere.
It is easy to see which cache contexts a certain page varies by and which cache tags it is invalidated by: one must only look at the X-Drupal-Cache-Contexts and X-Drupal-Cache-Tags headers!
But those headers are not in your list. If you're in touch with the website's developers, ask them to do the following:
You can debug cacheable responses (responses that implement this interface, which may be cached by Page Cache or Dynamic Page Cache) by setting the http.response.debug_cacheability_headers container parameter to true, in your services.yml. Followed by a container rebuild, which is necessary when changing a container parameter.
That will cause Drupal to send X-Drupal-Cache-Tags, X-Drupal-Cache-Contexts headers.
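On the client side, one thing worth trying is to ask any intermediate cache for a fresh copy. The sketch below is an assumption rather than a confirmed fix for this server: it sends `Cache-Control: no-cache` and `Pragma: no-cache` request headers (open-uri accepts string-keyed options as header fields) and appends a throwaway query parameter, since the `x-speed-cache-key` header suggests a cache keyed by URL. The helper names `cache_busted` and `fetch_fresh` are hypothetical.

```ruby
require 'open-uri'
require 'securerandom'

# Request headers asking caches to revalidate; a server or proxy may ignore them.
NO_CACHE_HEADERS = {
  'Cache-Control' => 'no-cache',
  'Pragma'        => 'no-cache'
}.freeze

# Append a throwaway "_" query parameter so URL-keyed caches see a new key.
def cache_busted(url, token = SecureRandom.hex(8))
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || '')
  params << ['_', token]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Fetch the page body with both techniques applied (performs a network request).
def fetch_fresh(url)
  URI.open(cache_busted(url), NO_CACHE_HEADERS).read
end
```

In the polling loop, `Nokogiri::HTML.parse(fetch_fresh(url))` would replace the `open(url)` call. If the stale content persists, the cache is ignoring both hints and only the site's operators can change that.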

Related

Caching - Fix browsers locally caching pages

I'm a bit unsure what is happening here, but I'll try to explain it and maybe write a better question once I've figured out what I'm actually asking.
I have just installed Varnish, which seems awesome for my request times. It is a Magento 2 store, and I followed the default Varnish configuration from the dev docs.
My Issue
Currently my issue is that the browser seems to cache the page until I click refresh. I believe I am successfully flushing/purging the cache with Magento/Varnish: when I request the page with curl, a new page is generated each time I flush the cache, and the cached page is served if I don't.
In Chrome and Firefox on my client PC, however, the whole page markup seems to be cached (when clicking a link to the page or pasting the URL into the browser) until I click refresh, which seems to reload the real page. When I deploy new static files, the old resource URLs are still in the cached markup, while the new resource location is signed (e.g. version1234/styles.css) and does not match the markup, so I get CSS-less pages until the client clicks refresh and loads the actual markup from the server.
How can I set up caching so that this does not happen?
Curl -IL result of URL:
HTTP/1.1 200 OK
Date: Fri, 24 Nov 2017 12:08:32 GMT
Strict-Transport-Security: max-age=63072000; includeSubdomains
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
Expires: Sun, 26 Nov 2017 15:55:17 GMT
Cache-Control: max-age=186400, public, s-maxage=186400
Pragma: cache
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Vary: Accept-Encoding
X-UA-Compatible: IE=edge
Content-Type: text/html; charset=UTF-8
X-Magento-Cache-Control: max-age=186400, public, s-maxage=186400
X-Magento-Cache-Debug: HIT
Grace: none
age: 0
Accept-Ranges: bytes
Connection: keep-alive
Browser caching takes place because these headers are being sent:
Expires: Sun, 26 Nov 2017 15:55:17 GMT
Cache-Control: max-age=186400, public, s-maxage=186400
You should adjust your server configuration so that those are not sent for PHP. Most likely you have a configuration block in nginx or .htaccess that applies to the whole website, as opposed to just static files.
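The policy this answer describes (long-lived browser caching for static files, none for PHP-generated HTML) can be sketched in Ruby purely as an illustration. The extension list, the lifetime value, and the function name `browser_cache_control` are assumptions; a real setup would express this in the nginx, .htaccess, or Varnish configuration.

```ruby
# Extensions treated as static assets; an illustrative list, not Magento's.
STATIC_EXTENSIONS = %w[.css .js .png .jpg .gif .svg .woff .woff2].freeze

# Return the Cache-Control value the browser should receive for a request path.
def browser_cache_control(path)
  if STATIC_EXTENSIONS.include?(File.extname(path).downcase)
    'public, max-age=31536000' # versioned static files can be cached for a long time
  else
    'no-cache'                 # HTML is revalidated with the server before each reuse
  end
end

puts browser_cache_control('/version1234/styles.css')
puts browser_cache_control('/index.php')
```

With a split like this, Varnish can still cache the HTML on the server side while browsers ask for the current markup on every visit.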

Why Google Api fonts last-modified keep changing?

When I make a request to a Google Api font (e.g. https://fonts.googleapis.com/css?family=Roboto:400), the last-modified header is always changing to the current time.
Caching doesn't work as a result and the file has to be downloaded every load. Is there a reason for this? Should I download the file and host it on my server?
Should I download the file and host it on my server?
Absolutely not, because the content of the CSS files is dynamic and has different content for every user agent. This is because not all browsers support all font formats. Some require WOFF/WOFF2, others require EOT, TTF or SVG. By downloading and serving the file statically you will break font support for all other browsers.
Interestingly though, I do not see a last-modified header at all:
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Type: text/css
Alt-Svc: clear
Alternate-Protocol: 443:quic,p=0
X-XSS-Protection: 1; mode=block
Server: GSE
Expires: Mon, 14 Dec 2015 09:14:21 GMT
Timing-Allow-Origin: *
Cache-Control: private, max-age=86400
Date: Mon, 14 Dec 2015 09:14:21 GMT
Content-Length: 222
Connection: close
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
The Expires header is indeed the same as the Date header, so by that header alone the resource would expire the moment it was loaded. The max-age directive of Cache-Control takes precedence over Expires, though, so the browser should cache the file for one day (86400 seconds).
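The precedence rule can be made concrete with a small Ruby sketch. It is a simplified model of how a private (browser) cache derives a freshness lifetime, not a full implementation of the HTTP caching rules; the function name `freshness_lifetime` is hypothetical, and header names are assumed to be lowercase.

```ruby
require 'time'

# Freshness lifetime in seconds for a private (browser) cache, or nil if unknown.
def freshness_lifetime(headers)
  cache_control = headers.fetch('cache-control', '')
  if (match = cache_control.match(/\bmax-age=(\d+)/i))
    return match[1].to_i # max-age wins over Expires
  end

  expires = headers['expires']
  date = headers['date']
  return nil unless expires && date

  # Fall back to Expires minus Date when no max-age is given.
  (Time.httpdate(expires) - Time.httpdate(date)).to_i
end

google_fonts = {
  'cache-control' => 'private, max-age=86400',
  'expires'       => 'Mon, 14 Dec 2015 09:14:21 GMT',
  'date'          => 'Mon, 14 Dec 2015 09:14:21 GMT'
}
puts freshness_lifetime(google_fonts) # prints 86400
```

Without the Cache-Control line, the same headers would give a lifetime of 0 seconds, which is the behaviour the Expires header alone suggests.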

Disable caching of content in firefox offline mode

I am working on a web application which has user management in place. I found a concerning issue in Firefox related to Work Offline mode. The following steps describe the scenario:
User logs in to the application
User performs some action and logs out of the application
If the user now enables Work Offline mode in Firefox, he/she can use the browser's back button to access the last page. However, this page is supposed to be secure.
In my opinion this is a data security issue, as any other user can apply this technique to retrieve valuable information belonging to the previous user.
I have used cache control headers to communicate to the browser that HTML content should not be cached. Following are the response headers used:
HTTP/1.1 200 OK
Date: Tue, 05 May 2015 10:39:30 GMT
Server: Apache/2.4.9 (Unix) OpenSSL/0.9.8za
Cache-Control: no-cache, no-store
Expires: Wed, 31 Dec 1969 23:59:59 GMT
Content-Type: text/html;charset=UTF-8
Content-Language: en
Vary: Accept-Encoding
Content-Encoding: gzip
X-Frame-Options: SAMEORIGIN
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
I have used
Cache-Control: no-cache, no-store
Expires: Wed, 31 Dec 1969 23:59:59 GMT
I have noted this vulnerability in applications like Facebook. Is this resolvable? Thank you.

Images loading from akamai not caching in the browser

Images loading from akamai not caching in the browser.
Looking at the developer tools window, I see this in the response headers:
Accept-Ranges bytes
Access-Control-Allow-Cred... true
Access-Control-Allow-Orig... *
Access-Control-Request-He... X-Requested-With,Content-Type,Accept,Origin
Cache-Control max-age=0, no-cache, no-store
Connection Keep-Alive
Content-Length 114069
Content-Type image/jpeg
Date Tue, 02 Jul 2013 14:20:52 GMT
Etag "01bd6c5172ce1:0"
Expires Tue, 02 Jul 2013 14:20:52 GMT
Last-Modified Wed, 26 Jun 2013 00:11:58 GMT
Pragma no-cache
Server Microsoft-IIS/7.5
Set-Cookie BNI__BARRACUDA_LB_COOKIE=00000000000000000000000097f7ab4200005000; Path=/; HttpOnly
X-CFLO-Cache-Result TCP_NC_MISS
X-Powered-By ASP.NET
What can I do to force the Akamai servers to change the image headers so that the images can be cached in the browser?
Generally, Akamai preserves the Cache-Control header sent by the origin. If you don't set Cache-Control on your application/web server, it is highly recommended that you do. But if you want to create specific policies in Akamai, you will need to enable "Set Browser Cache Control Headers".

Double Set-Cookie in Magento, leading to a login issue for some users

We have a Magento application which is issuing two Set-Cookie headers. Here are the headers:
HTTP/1.1 200 OK
Date: Wed, 18 Apr 2012 21:04:28 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.2.10
Set-Cookie: frontend=iti6c00cdm6cc79hfl1pl9pq52; expires=Wed, 18-Apr-2012 22:04:28 GMT; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: frontend=iti6c00cdm6cc79hfl1pl9pq52; expires=Wed, 18-Apr-2012 22:04:28 GMT; path=/; domain=**example.com**
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
In some circumstances, after logging in, the second cookie is set to frontend=deleted. From my reading, two frontend= cookies are not a "problem" in themselves; this is standard Magento behaviour. From my reading of the spec, the second frontend= cookie will overwrite the first if their scope is the same.
Any ideas where we can start digging in to this problem to see why the second frontend= cookie does not behave like the first?
The Magento version is Enterprise Edition 1.9.0.0.
Related Questions
Why does Magento use 2 cookies per session?
Magento Cookies Changing Prevent Frontend Login
This happens when the session validation checks fail: the cookie is then cleared by setting it to the value "deleted" with an expiration date in the past.
Magento checks the following information to validate a session:
The client IP address that is connecting to the server
The "Via" HTTP-Header
The "X-Forwarded-For" Header
The "User-Agent" Header
If one or more of these values change between requests for the same session ID, the session is discarded, the cookie is cleared as described above, and the server sends a redirect to the homepage.
You can change which information is validated in the Magento admin panel under System > Configuration > Web. But you should never turn off all checks, since this would allow session hijacking.
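The validation described above can be sketched as a fingerprint comparison. This is an illustrative Ruby model of the idea, not Magento's actual PHP code: the key names, the helpers `fingerprint` and `changed_keys`, and the sample addresses are all assumptions.

```ruby
# Request attributes compared on every request (names are illustrative).
VALIDATED_KEYS = %i[remote_addr via x_forwarded_for user_agent].freeze

# Snapshot the validated attributes when the session is created.
def fingerprint(request)
  VALIDATED_KEYS.to_h { |key| [key, request[key]] }
end

# Return the keys whose values differ from the stored snapshot.
def changed_keys(stored, request)
  VALIDATED_KEYS.reject { |key| stored[key] == request[key] }
end

first  = { remote_addr: '203.0.113.5', via: nil, x_forwarded_for: nil, user_agent: 'Mozilla/5.0' }
second = first.merge(x_forwarded_for: '198.51.100.7') # e.g. a proxy appeared mid-session

stored = fingerprint(first)
puts changed_keys(stored, second).inspect # prints [:x_forwarded_for]
```

A non-empty result corresponds to the case where the session is discarded and the cookie is reset to "deleted". Logging which key changed for affected users is one way to find out which check is failing.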
Do you want to override the frontend cookie? If so, try destroying the cookie first and then resetting it with the Magento method:
Mage::getModel('core/cookie')->set('frontend', $session->getCustomer()->getId(), 100000*24*3600);

Resources