NGINX proxy_pass not caching content

I'm having issues getting NGINX to cache thumbnails that I'm pulling from Dropbox using the proxy_pass command. On the same server that NGINX is running I run the following command multiple times
wget --server-response --spider http://localhost:8181/1/thumbnails/auto/test.jpg?access_token=123
and get the exact same response with X-Cache: MISS every time
HTTP/1.1 200 OK
Server: nginx/1.1.19
Date: Wed, 25 Mar 2015 20:05:36 GMT
Content-Type: image/jpeg
Content-Length: 1691
Connection: keep-alive
pragma: no-cache
cache-control: no-cache
X-Robots-Tag: noindex, nofollow, noimageindex
X-Cache: MISS
Here's the meat of my nginx.conf file. Any ideas on what I'm doing wrong here?
## Proxy Server Caching
proxy_cache_path /data/nginx/cache keys_zone=STATIC:10m max_size=1g;
## Proxy Server Setting
server {
listen *:8181;
proxy_cache STATIC;
proxy_cache_key "$request_uri";
proxy_cache_use_stale error timeout invalid_header updating
http_500 http_502 http_503 http_504;
location ~ ^/(.*) {
set $dropbox_api 'api-content.dropbox.com';
set $url '$1';
resolver 8.8.8.8;
proxy_set_header Host $dropbox_api;
proxy_cache STATIC;
proxy_cache_key "$request_uri";
proxy_cache_use_stale error timeout invalid_header updating
http_500 http_502 http_503 http_504;
add_header X-Cache $upstream_cache_status;
proxy_pass https://$dropbox_api/$url$is_args$args;
}
##Error Handling
error_page 500 502 503 504 404 /error/;
location = /error/ {
default_type text/html;
}
}

It turns out that thumbnail responses returned from Dropbox include the header
Cache-Control: no-cache
and Nginx adheres to such headers unless they are explicitly ignored, which can be done with the following config line. It makes Nginx disregard these cache-control headers from the upstream:
proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
We also had issues when placing the "proxy_ignore_headers" option in different parts of the nginx.conf file. After much experimenting, we got it to work by setting it explicitly in the "location" block. The full snippet of the config file can be found below:
## Proxy Server Caching
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=STATIC:50m inactive=2h max_size=2g;
## Proxy Server Setting
server {
listen *:8181;
location ~ ^/(.*) {
set $dropbox_api 'api-content.dropbox.com';
set $url '$1';
resolver 8.8.8.8;
proxy_set_header Host $dropbox_api;
proxy_hide_header x-dropbox-thumbcachehit;
proxy_hide_header x-dropbox-metadata;
proxy_hide_header x-server-response-time;
proxy_hide_header x-dropbox-request-id;
proxy_hide_header cache-control;
proxy_hide_header expires;
add_header cache-control "private";
add_header x-cache $upstream_cache_status; # HIT / MISS / BYPASS / EXPIRED
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout invalid_header updating
http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
proxy_pass https://$dropbox_api/$url$is_args$args;
}
}
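To confirm the effect without sending requests by hand, the cache status can also be written to the access log. This is a minimal sketch and not part of the original answer: the format name cache_status and the log path are assumptions, and log_format has to sit in the http block.

```nginx
# Assumed names: the "cache_status" format and the log file path are illustrative.
log_format cache_status '$remote_addr "$request" $status '
                        'cache=$upstream_cache_status';

server {
    listen *:8181;
    # One line per request, ending in cache=MISS, cache=HIT and so on.
    access_log /var/log/nginx/cache_status.log cache_status;
    # The location block from the config above goes here unchanged.
}
```

With caching working, a first request for a thumbnail should be logged as MISS and repeated requests for the same URL as HIT.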

Nginx will not cache a proxied response that carries a Set-Cookie header, so the traffic between Nginx and the origin has to be made cookie-less:
proxy_hide_header Set-Cookie;
proxy_ignore_headers Set-Cookie;
See full configuration with invalidation methods: https://gist.github.com/mikhailov/9639593

If the above answers didn't solve your issue, try this:
proxy_cache_valid 200 2d;
(or whatever amount of time and whatever response code you want)
Add this wherever you enable the cache with proxy_cache <keys_zone_name>.
For me, as soon as I remove the proxy_cache_valid directive, no cache status shows up. The documentation doesn't say that this directive is required. Let me know if this works for you, so we might update the documentation.
I would expect the proxy_cache getting-started page to show that you need at least these three directives to get started: proxy_cache_path, proxy_cache and proxy_cache_valid.
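The three directives fit together roughly as follows. This is a minimal sketch rather than a tested setup: the cache path, the zone name STATIC, the listen port and the upstream address are placeholders.

```nginx
# http context: where cached files live and the shared-memory zone for keys.
proxy_cache_path /data/nginx/cache keys_zone=STATIC:10m;

server {
    listen 8181;

    location / {
        proxy_cache STATIC;            # enable the zone for this location
        proxy_cache_valid 200 2d;      # lifetime used when the upstream sends no caching headers
        add_header X-Cache $upstream_cache_status;
        proxy_pass http://127.0.0.1:8080;   # placeholder upstream
    }
}
```

If the upstream already sends Cache-Control or Expires headers that permit caching, nginx can take the lifetime from those instead. proxy_cache_valid matters when those headers are absent or ignored.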

Related

Nginx not caching when Vary headers not being ignored

First off: I don't have much experience with Nginx.
I'll just proceed directly to the problem though:
Nginx config:
user www-data;
worker_processes auto;
pid /run/nginx.pid;
events {
worker_connections 2048;
multi_accept on;
}
http {
proxy_cache_path /var/nginx_cache levels=1:2 keys_zone=STATIC:10m inactive=24h max_size=10g;
upstream server {
server -removed-;
}
server {
listen 80;
server_name -removed-;
location / {
gzip on;
gzip_disable "MSIE [1-6]\.(?!.*SV1)";
gzip_http_version 1.1;
gzip_min_length 500;
gzip_vary on;
gzip_proxied any;
gzip_types
application/atom+xml
application/javascript
application/json
application/ld+json
application/manifest+json
application/rss+xml
application/vnd.geo+json
application/vnd.ms-fontobject
application/x-font-ttf
application/x-web-app-manifest+json
application/xhtml+xml
application/xml
font/opentype
image/bmp
image/svg+xml
image/x-icon
text/cache-manifest
text/css
text/plain
text/vcard
text/vnd.rim.location.xloc
text/vtt
text/x-component
text/x-cross-domain-policy
text/js
text/xml
text/javascript;
add_header X-Cache-Status $upstream_cache_status;
proxy_cache STATIC;
proxy_set_header Host $host;
----> proxy_ignore_headers Vary; <-----
proxy_cache_key $host$uri$is_args$args;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_pass -removed-;
}
}
}
When the line 'proxy_ignore_headers Vary;' is set, everything will cache, including the HTML pages. When I remove this line, everything gets cached EXCEPT the HTML pages. Why is this?
I would like Nginx to cache the HTML pages even when Vary headers are sent by the origin server.
I hope someone can help me :).
Response Headers are:
Vary:Host, Content-Language, Content-Type, Content-Encoding
Fixed:
The Nginx source code sets a maximum of 42 characters for the value of the Vary header. In my case there were 51 characters, so my Vary header was being handled as Vary: * (no caching). Raising the maximum to 84 fixed it for me.
This article explains it more in depth.
https://thedotproduct.org/nginx-vary-header-handling/
Credits to the guy posting that short article.
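Where rebuilding Nginx is not an option, a possible alternative is to keep proxy_ignore_headers Vary and put the request headers that really change the response into the cache key. This is a sketch under assumptions: it supposes that only Accept-Language and Accept-Encoding affect what the origin returns, which may not hold for every site, and the upstream name backend is a placeholder.

```nginx
location / {
    proxy_cache STATIC;
    # Stop Nginx from evaluating the Vary header itself.
    proxy_ignore_headers Vary;
    # Assumption: these two request headers are the only ones that change the response.
    proxy_cache_key $host$uri$is_args$args$http_accept_language$http_accept_encoding;
    proxy_set_header Host $host;
    proxy_pass http://backend;   # placeholder; needs a matching upstream block
}
```

The raw header values differ a lot between browsers, so this key can split the cache into many near-identical entries. That is the cost of not letting Nginx handle Vary.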

Caching images on all folder levels of nginx reverse proxy

I'm trying to get my head around caching images for my open source image hosting service PictShare.
Pictshare has a smart query system where an uploaded image can be in a "virtual subdirectory" that changes the image. For example this is the link to the uploaded stackoverflow logo: https://www.pictshare.net/6cb55fe938.png
I can resize it to 300 width by adding /300/ to the URL before the image name: https://www.pictshare.net/300/6cb55fe938.png
Since I've been dealing with a lot of traffic lately, I want my nginx proxy to cache all images from all virtual subfolders, but it's not working. I've read many articles and many stackoverflow posts but no solution worked for me.
So far this is my production vhost file:
proxy_cache_path /etc/nginx/cache/pictshare levels=1:2 keys_zone=pictshare:50m max_size=1000m inactive=30d;
proxy_temp_path /etc/nginx/tmp 1 2;
proxy_cache_key "$scheme$request_method$host$request_uri";
proxy_ignore_headers "Set-Cookie";
proxy_hide_header "Set-Cookie";
proxy_buffering on;
server {
...
location / {
proxy_pass http://otherserver/pictshare/;
include /etc/nginx/proxy_params;
location ~* \.(?:jpg|jpeg|gif|png|ico)$ {
expires max;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_cache_valid 200 301 302 1y;
proxy_cache pictshare;
proxy_pass http://otherserver/pictshare$request_uri;
}
}
}
The problem is that no files are cached and I see every image request on the proxy destination.
The only way I got it to work was by adding a special location to the host file that has caching explicitly enabled:
location /cached {
proxy_cache_valid 200 1y;
proxy_cache pictshare;
expires 1y;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
include /etc/nginx/proxy_params;
proxy_pass http://otherserver/pictshare/thumbs/;
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
}
The obvious problem with this solution is that the images are only cached when the request starts with /cached eg: https://www.pictshare.net/cached/6cb55fe938.png
Adding the caching directives to the root location is not an option for me, since I don't want the forms and pages to be cached, just the images.
Where is my mistake?
proxy_cache_path /etc/nginx/cache/pictshare levels=1:2 keys_zone=my_cache:50m max_size=3g inactive=180m;
proxy_temp_path /etc/nginx/tmp 1 2;
proxy_cache_key "$scheme$request_method$host$request_uri";
location ~* ^.+\.(jpe?g|gif|png|ico|pdf)$ {
access_log off;
include /etc/nginx/proxy.conf;
proxy_pass http://backend;
proxy_cache my_cache; # must match the keys_zone name in proxy_cache_path
proxy_cache_valid any 12h;
add_header X-Proxy-Cache $upstream_cache_status;
root /var/www/public_html/cached;
}
location / {
include /etc/nginx/proxy.conf;
proxy_pass http://backend;
root /var/www/public_html;
}
nginx first searches for the most specific prefix location given by literal strings regardless of the listed order. In the configuration above the only prefix location is “/” and since it matches any request it will be used as a last resort. Then nginx checks locations given by regular expression in the order listed in the configuration file. The first matching expression stops the search and nginx will use this location. If no regular expression matches a request, then nginx uses the most specific prefix location found earlier.
http://nginx.org/en/docs/http/request_processing.html
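The matching order described above can be checked with a small throwaway server. This is only an illustrative sketch: the port and the /images/ prefix are made up, and return is used in place of proxy_pass so that each response names the location that handled it.

```nginx
server {
    listen 8080;   # hypothetical test port

    location / {
        return 200 "matched: prefix /";
    }

    location /images/ {
        return 200 "matched: prefix /images/";
    }

    location ~* \.(jpe?g|gif|png|ico)$ {
        return 200 "matched: regex for images";
    }
}

# GET /index.html          -> matched: prefix /
# GET /images/readme.txt   -> matched: prefix /images/
# GET /300/6cb55fe938.png  -> matched: regex for images
# GET /images/logo.png     -> matched: regex for images
```

The last case is the important one for image caching: a matching regular expression wins over even the longest prefix location, so a single image regex at server level catches images in every virtual subfolder.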

How to enable nginx proxy caching for gunicorn mezzanine

Our stack is nginx - gunicorn - mezzanine (django cms) running on an EC2 instance. Everything works, but I can't seem to enable nginx proxy_cache. Here is my minimal config:
upstream %(proj_name)s {
server 127.0.0.1:%(gunicorn_port)s;
}
proxy_cache_path /cache keys_zone=bravo:10m;
server {
listen 80;
listen 443;
server_name %(live_host)s;
client_max_body_size 100M;
keepalive_timeout 15;
location /static/ {
expires 1M;
access_log off;
add_header Cache-Control "public";
root %(proj_path)s;
}
location / {
expires 1M;
add_header X-Proxy-Cache $upstream_cache_status;
proxy_cache bravo;
proxy_ignore_headers Cache-Control Expires Set-Cookie;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-Protocol $scheme;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_pass http://%(proj_name)s;
}
}
Sample response:
HTTP/1.1 200 OK
Server: nginx/1.4.6 (Ubuntu)
Date: Wed, 07 Jan 2015 03:43:47 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Vary: Cookie, Accept-Language
Content-Language: en
Expires: Fri, 06 Feb 2015 03:43:47 GMT
Cache-Control: max-age=2592000
X-Proxy-Cache: MISS
Content-Encoding: gzip
I have mezzanine cache middleware enabled and it is returning responses with Set-Cookie headers, but proxy_ignore_headers should ignore that.
I did chmod 777 on proxy_cache_path dir (/cache) so it shouldn't be a permissions issue.
Error logging is enabled but has produced nothing.
proxy_cache_path continues to remain completely empty...
Why is nginx not caching anything with this config?
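One thing worth checking in the config above, offered as a guess rather than a confirmed fix: it ignores Cache-Control and Expires but sets no proxy_cache_valid, so nginx may have no lifetime to apply and therefore stores nothing. A sketch of the location block with that directive added; the 10m lifetime is an arbitrary assumption.

```nginx
location / {
    add_header X-Proxy-Cache $upstream_cache_status;
    proxy_cache bravo;
    proxy_ignore_headers Cache-Control Expires Set-Cookie;
    # Assumed lifetime; with the upstream headers ignored, this supplies one.
    proxy_cache_valid 200 10m;
    proxy_set_header Host $host;
    proxy_pass http://%(proj_name)s;
}
```

Note that ignoring Set-Cookie means responses generated for one visitor can be served to another, so this only suits pages that are the same for everyone.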

nginx is not caching proxied responses on disk, even though I ask it to

I'm using nginx as a load-balancing proxy, and I would also like it to cache its responses on disk so it doesn't have to hit the upstream servers as often.
I tried following the instructions at http://wiki.nginx.org/ReverseProxyCachingExample. I'm using nginx 1.7 as provided by Docker.
Here's my nginx.conf (which gets installed into nginx/conf.d/):
upstream balancer53 {
server conceptnet-api-1:10053;
server conceptnet-api-2:10053;
}
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=STATIC:1g max_size=1g;
server {
listen 80;
gzip on;
gzip_proxied any;
gzip_types application/json;
charset utf-8;
charset_types application/json;
location /web {
proxy_pass http://balancer53;
proxy_set_header X-Remote-Addr $proxy_add_x_forwarded_for;
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control X-RateLimit-Limit X-RateLimit-Remaining X-RateLimit-Reset;
}
location /data/5.3 {
proxy_pass http://balancer53;
proxy_set_header X-Remote-Addr $proxy_add_x_forwarded_for;
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control X-RateLimit-Limit X-RateLimit-Remaining X-RateLimit-Reset;
}
location /data/5.2 {
# serve the old version
proxy_pass http://conceptnet52:10052/;
proxy_set_header X-Remote-Addr $proxy_add_x_forwarded_for;
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control X-RateLimit-Limit X-RateLimit-Remaining X-RateLimit-Reset;
}
location / {
root /var/www;
index index.html;
autoindex on;
rewrite ^/static/(.*)$ /$1;
}
}
Despite this configuration, nothing ever shows up in /data/nginx/cache.
Here's an example of the response headers from the upstream server:
$ curl -vs http://localhost:10053/data/5.3/assoc/c/en/test > /dev/null
* Hostname was NOT found in DNS cache
* Trying ::1...
* Connected to localhost (::1) port 10053 (#0)
> GET /data/5.3/assoc/c/en/test HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:10053
> Accept: */*
>
< HTTP/1.1 200 OK
* Server gunicorn/19.1.1 is not blacklisted
< Server: gunicorn/19.1.1
< Date: Thu, 06 Nov 2014 20:54:52 GMT
< Connection: close
< Content-Type: application/json
< Content-Length: 1329
< Access-Control-Allow-Origin: *
< X-RateLimit-Limit: 60
< X-RateLimit-Remaining: 59
< X-RateLimit-Reset: 1415307351
<
{ [data not shown]
* Closing connection 0
Each upstream server is enforcing a rate limit, but I am okay with disregarding the rate limit on cached responses. I was unsure whether these headers were preventing caching, which is why I told nginx to ignore them.
What do I need to do to get nginx to start using the cache?
The official documentation for ngx_http_proxy_module states: "If the header includes the “Set-Cookie” field, such a response will not be cached."
To make caching work, hide and ignore that header:
location /web {
...
proxy_hide_header Set-Cookie;
proxy_ignore_headers Set-Cookie;
}
I tried running nginx alone with that nginx.conf, and found that it complained about some of the options being invalid. I think I was never successfully building a new nginx container at all.
In particular, it turns out you don't just put any old headers in the proxy_ignore_headers option. It only takes particular headers as arguments, ones that the proxy system cares about.
Here is my revised nginx.conf, which worked:
upstream balancer53 {
server conceptnet-api-1:10053;
server conceptnet-api-2:10053;
}
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=STATIC:100m max_size=100m;
server {
listen 80;
gzip on;
gzip_proxied any;
gzip_types application/json;
charset utf-8;
charset_types application/json;
location /web {
proxy_pass http://balancer53;
proxy_set_header X-Remote-Addr $proxy_add_x_forwarded_for;
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
}
location /data/5.3 {
proxy_pass http://balancer53;
proxy_set_header X-Remote-Addr $proxy_add_x_forwarded_for;
proxy_cache STATIC;
proxy_cache_valid 200 1d;
proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
}
location / {
root /var/www;
index index.html;
autoindex on;
rewrite ^/static/(.*)$ /$1;
}
}
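For reference, the sketch below summarizes which field names proxy_ignore_headers accepts according to the ngx_http_proxy_module documentation (some of them only in newer nginx versions). The proxy_hide_header lines are an assumption of mine, not part of the working config above: they show one way to keep the rate-limit headers out of responses served to clients, since proxy_hide_header takes arbitrary field names.

```nginx
# Fields accepted by proxy_ignore_headers:
#   X-Accel-Redirect, X-Accel-Expires, X-Accel-Limit-Rate, X-Accel-Buffering,
#   X-Accel-Charset, Expires, Cache-Control, Set-Cookie, Vary
location /data/5.3 {
    proxy_pass http://balancer53;
    proxy_cache STATIC;
    proxy_cache_valid 200 1d;
    proxy_ignore_headers X-Accel-Expires Expires Cache-Control;   # accepted
    # proxy_ignore_headers X-RateLimit-Limit;   # not an accepted field; the config test fails

    # Optional: do not pass the upstream's rate-limit headers on to clients.
    proxy_hide_header X-RateLimit-Limit;
    proxy_hide_header X-RateLimit-Remaining;
    proxy_hide_header X-RateLimit-Reset;
}
```

Running nginx -t before building the container reports this kind of error immediately, which would have exposed the silent build failure described above.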

Nginx Cache doesn't refresh

I configured nginx to cache all files for 3 minutes. This works only for files I upload to the webserver manually. All files generated by the CMS get cached forever (or at least for a long time; I didn't wait to find out).
The CMS delivers all pages as "index.html" within its own folder structure (www.x.de/category1/category2/articlename/index.html).
How can I debug this? Is there a way to check the lifetime of a specific file?
Can something in the .html files overwrite the proxy_cache_valid value?
Many thanks!
Config:
server {
listen 1.2.3.4:80 default_server;
server_name x.de;
server_name www.x.de;
server_name ipv4.x.de;
client_max_body_size 128m;
location / { # IPv6 isn't supported in proxy_pass yet.
proxy_pass http://apache.ip:7080;
proxy_cache my-cache;
proxy_cache_valid 200 3m;
proxy_cache_valid 404 1m;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Accel-Internal /internal-nginx-static-location;
access_log off;
}
location /internal-nginx-static-location/ {
alias /var/www/vhosts/x.de/httpdocs/cms/;
access_log /var/www/vhosts/x.de/statistics/logs/proxy_access_log;
add_header X-Powered-By PleskLin;
internal;
}
}
Using curl -I, you can retrieve the headers which will tell you what the cache settings are.
E.g.
>>> curl -I http://www.google.com
HTTP/1.1 200 OK
Date: Sun, 09 Feb 2014 06:28:36 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
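Cache settings are carried in the HTTP response headers, so the content of the HTML files cannot change them. Caching headers sent by the backend (Expires, Cache-Control, X-Accel-Expires) can, however, override the proxy_cache_valid time.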
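If the CMS turns out to send a long Expires or Cache-Control lifetime, that would explain pages staying cached well beyond 3 minutes. A sketch of the location block that makes nginx disregard those headers, assuming that is the cause; the X-Cache-Status header name is my own choice.

```nginx
location / {
    proxy_pass http://apache.ip:7080;
    proxy_cache my-cache;
    # Ignore lifetimes sent by the backend so the values below apply.
    proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
    proxy_cache_valid 200 3m;
    proxy_cache_valid 404 1m;
    # Assumed header name; reports HIT, MISS, EXPIRED and so on.
    add_header X-Cache-Status $upstream_cache_status;
    proxy_set_header Host $host;
}
```

Running curl -I against the backend directly (http://apache.ip:7080 in the config above) shows which caching headers the CMS actually sends, and the X-Cache-Status header on the proxied response shows when an entry expires.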

Resources