Nginx performance problems

I am having performance problems with my website. My setup is a 1 GB VPS running WordPress/nginx/php-fpm on Ubuntu 11.04. The bottleneck is the time the browser spends waiting for the first byte from the server: it takes 4-6 seconds just waiting for the first response after initiating the connection. (The website is new and currently receives very low traffic, about 50-150 visits/day.) Below is my nginx configuration; I hope it helps in understanding where the problem is. I want to know if there is something wrong with this configuration that could be optimized. Also, can anyone recommend profiling/analysis tools that suit my setup?
Note: I replaced my username with myusername and my domain with mydomain.com.
nginx.conf
user myusername;
worker_processes 4;
pid /var/run/nginx.pid;
events {
worker_connections 768;
# multi_accept on;
}
http {
index index.php index.html index.htm;
sendfile on;
# tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 5;
types_hash_max_size 2048;
# server_tokens off;
# server_names_hash_bucket_size 64;
# server_name_in_redirect off;
include /etc/nginx/mime.types;
default_type application/octet-stream;
client_max_body_size 50m;
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
gzip on;
gzip_disable "msie6";
# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
##
# Virtual Host Configs
##
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
sites-enabled/default
server {
listen 80; ## listen for ipv4; this line is default and implied
listen [::]:80 default ipv6only=on; ## listen for ipv6
root /home/myusername/www;
# Make site accessible from http://localhost/
server_name mydomain.com;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to index.html
try_files $uri $uri/ /index.php;
}
location /doc {
# root /usr/share;
autoindex on;
allow 127.0.0.1;
deny all;
}
location /images {
# root /usr/share;
autoindex off;
}
error_page 404 = @wordpress;
log_not_found off;
location @wordpress {
fastcgi_pass 127.0.0.1:9000;
fastcgi_param SCRIPT_FILENAME $document_root/index.php;
include /etc/nginx/fastcgi_params;
fastcgi_param SCRIPT_NAME /index.php;
}
location ^~ /files/ {
rewrite /files/(.+) /wp-includes/ms-files.php?file=$1 last;
}
# redirect server error pages to the static page /50x.html
#
#error_page 500 502 503 504 /50x.html;
#location = /50x.html {
# root /usr/share/nginx/www;
#}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
location ~ \.php$ {
try_files $uri @wordpress;
fastcgi_index index.php;
fastcgi_pass 127.0.0.1:9000;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include /etc/nginx/fastcgi_params;
}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
location ~ /\.ht {
deny all;
}
location ^~ /blogs.dir/ {
internal;
root /home/myusername/www/wp-content;
}
}

Looks like a WordPress site; I'd lean more towards this being a performance problem in WordPress itself than in the nginx config.
Some recommendations:
1 - Make sure you have APC installed and enabled (see the sketch below)
2 - Install a server-side caching plugin (W3 Total Cache or WP Super Cache) and configure it to use APC as a backing store (and turn on all of the caching layers)
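If APC isn't installed yet, here is a minimal sketch for an Ubuntu/php5-fpm setup like yours (the php-apc package and php5-fpm service names are assumptions for that release; adjust as needed):
sudo apt-get install php-apc      # install the APC opcode/user cache for PHP
sudo service php5-fpm restart     # restart PHP-FPM so the extension is picked up
php -m | grep -i apc              # verify the module is loaded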
As far as profilers go, I'm a huge fan of New Relic; their Pro level is free for the first 2 weeks (usually long enough to find the hot spots), and basic performance information is free forever.

It seems your nginx configuration indeed has a lot of room for improvement. Nginx is already very efficient in how it utilizes CPU and memory, but we can tweak several parameters based on the type of workload we plan to serve. If we are primarily serving static files, we expect our workload profile to be less CPU-intensive and more disk/process-oriented. Your nginx.conf itself shouldn't be the problem, since the very nature of nginx is geared toward performance; still, as you stated, you're not getting good performance from it at all.
I also run a 1 GB, 1-core VPS with a fresh LEMP install (Ubuntu 14.04, nginx, MySQL, php5-fpm) and nothing else one would consider memory-consuming, such as cPanel, ZPanel and the like; no phpMyAdmin either (I use the MySQL Workbench app instead). I've got a WordPress site up and running without any cache plugins or even an APC/memcached scheme (I'm still researching the approach that best fits my needs), and I consistently get excellent performance.
Anyway, the nginx.conf below is still a very basic set of adjustments to increase nginx performance. It is a copy of the nginx.conf file I currently use to serve my website, shared here just as a reference. You can tweak it further based on your own research, but I believe you'll notice an overall improvement after trying it out.
So let's go through it...
TUNING nginx
Determine Nginx worker_processes and worker_connections
We can configure the number of single-threaded worker processes anywhere from one per CPU core up to 1.5-2x the core count for disk-bound workloads, to take advantage of disk bandwidth (IOPS).
Make sure you use the correct amount of worker_processes in your /etc/nginx/nginx.conf. As a baseline, it should equal the number of CPU cores shown in the output of the command below (run it in your terminal):
cat /proc/cpuinfo | grep processor
In my case the result below shows only one processor
root@server1:~# cat /proc/cpuinfo | grep processor
processor : 0
root@server1:~#
Since my machine has only 1 processor available, I set
[...]
worker_processes 1;
[...]
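For worker_connections, a common rule of thumb (my assumption, not a hard requirement) is to keep it at or below the per-process open-file limit, which you can check with:
ulimit -n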
I've commented most of the important parts that should be tweaked; again, you should do your own research and build a configuration that fits your production environment. We are not covering any caching techniques or serving the site over an SSL (HTTPS) secure connection here, just a plain, basic nginx configuration.
user nginx;
# Set the number of worker processes
# You can also set it to "auto" to let Nginx decide the right number
worker_processes 1;
pid /var/run/nginx.pid;
events {
# Increase worker connections
worker_connections 1024;
# Accept() as many connections as possible
# More information http://wiki.nginx.org/EventsModule
multi_accept on;
# Serve many clients with each thread (Linux)
use epoll;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
# Let NGINX get the real client IP for its access logs
set_real_ip_from 127.0.0.1;
real_ip_header X-Forwarded-For;
##
# Basic Settings
##
# Tweak TCP connections handling
sendfile on;
tcp_nopush on;
tcp_nodelay on;
# Increase keepalive timeout
keepalive_timeout 65;
# Reset timedout connections and free up some memory
reset_timedout_connection on;
# Other buffers/timeouts optimizations
# If you want to allow users to upload larger files, consider increasing client_max_body_size to whatever fits your needs
client_max_body_size 20m;
client_body_timeout 60;
client_header_timeout 60;
client_body_buffer_size 8K;
client_header_buffer_size 1k;
large_client_header_buffers 4 8k;
send_timeout 60;
types_hash_max_size 2048;
# Hide Nginx version
server_tokens off;
server_names_hash_bucket_size 128;
#server_name_in_redirect off;
##
# Logging Settings
##
# Disable access log to boost I/O on HDD
# (turning access_log off saves a lot of disk I/O as well as CPU power;
# if you need access logs, comment out "access_log off;" and use the line below instead)
access_log off;
# access_log /var/log/nginx/access.log main;
error_log /var/log/nginx/error.log;
##
# Gzip Settings
##
# Enable GZIP compression to save bandwidth
gzip on;
gzip_disable "msie6";
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/x-javascript application/json application/xml application/rss+xml font/truetype application/x-font-ttf font/opentype application/vnd.ms-fontobject image/svg+xml;
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
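After dropping a configuration like this in place, it's worth validating it and reloading nginx (a small sketch; the service command may differ on your distro):
sudo nginx -t              # check the configuration for syntax errors
sudo service nginx reload  # apply it without dropping active connections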
I hope it helps you to get started. Good luck.

Related

Downtime when deploying Laravel to Azure

I'm deploying a Laravel site to an Azure Web App (running Linux).
After upgrading to PHP 8 and nginx, I experience a lot more downtime after deployment: several minutes of nginx's Bad Gateway error.
In order to get Laravel working with nginx, I need to copy an nginx conf file from my project into nginx's config on the server.
I'm running startup.sh after deploy, which has the following commands as its first lines:
cp /home/site/wwwroot/devops/nginx.conf /etc/nginx/sites-available/default;
service nginx reload
Content of my nginx.conf:
server {
# adjusted nginx.conf to make Laravel 8 apps with PHP 8.0 features runnable on Azure App Service
# #see https://laravel.com/docs/8.x/deployment
listen 8080;
listen [::]:8080;
root /home/site/wwwroot/public;
index index.php;
client_max_body_size 100M;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-Content-Type-Options "nosniff";
gzip on;
gzip_proxied any;
gzip_min_length 256;
gzip_types
application/atom+xml
application/geo+json
application/javascript
application/x-javascript
application/json
application/ld+json
application/manifest+json
application/rdf+xml
application/rss+xml
application/xhtml+xml
application/xml
font/eot
font/otf
font/ttf
image/svg+xml
text/css
text/javascript
text/plain
text/xml;
location / {
try_files $uri $uri/ /index.php$is_args$args;
}
location ~ [^/]\.php(/|$) {
fastcgi_split_path_info ^(.+?\.php)(|/.*)$;
fastcgi_pass 127.0.0.1:9000;
include fastcgi_params;
fastcgi_param HTTP_PROXY "";
fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
fastcgi_param PATH_INFO $fastcgi_path_info;
fastcgi_param QUERY_STRING $query_string;
fastcgi_intercept_errors on;
fastcgi_connect_timeout 300;
fastcgi_send_timeout 3600;
fastcgi_read_timeout 3600;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
}
}
I've also tried to use Azure Deployment Slots, but the swap happens before the Bad Gateway error has gone away.
Is there something else I can do to minimize the downtime/time it takes for the project to get up and running again?
The "Bad Gateway" error suggests that Nginx is unable to connect to the backend, which in this case is PHP-FPM.
There are a few things you can try to minimize the downtime:
Increase the fastcgi_connect_timeout, fastcgi_send_timeout, and fastcgi_read_timeout values in your nginx configuration file. This will give PHP-FPM more time to start up and respond to requests.
Optimize your PHP code. Make sure your code is optimized for performance, as this will help reduce the time it takes for the site to start up.
Use Azure Deployment Slots for testing. Deployment slots allow you to test your code in a staging environment before deploying it to production. This can help reduce the risk of downtime in your production environment.
Try to make sure that your PHP-FPM and nginx services are always running, and that they are started automatically when the server boots up.
Try to reduce the number of restarts needed during deployment by using a deployment process that utilizes rolling upgrades.
Finally, you can try deploying a simple HTML file first, and then deploy the Laravel codebase. This will ensure that the web server and PHP are working before deploying the Laravel codebase.
Use trial and error to find out the best solution for your use case.
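One small, low-risk tweak along these lines is to validate the copied config and wait for PHP-FPM before reloading nginx in startup.sh. This is only a sketch using the paths and port from your question; pgrep may not exist in every image:
#!/bin/sh
cp /home/site/wwwroot/devops/nginx.conf /etc/nginx/sites-available/default
# Only reload nginx if the new configuration is valid
nginx -t && service nginx reload
# Give PHP-FPM up to ~30s to come up before traffic is served
i=0
while ! pgrep php-fpm > /dev/null && [ "$i" -lt 30 ]; do
  i=$((i+1)); sleep 1
done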

GCP Cloud Run returns "Faithfully yours, nginx"

I'm trying to host my Laravel application on GCP Cloud Run and everything works just fine, but for some reason, whenever I run a POST request with lots of data (100+ rows, about 64 MB) being saved to the database, it always throws an error. I'm using nginx with Docker, by the way. Please see the details below.
ERROR
Cloud Run Logs
The request has been terminated because it has reached the maximum request timeout.
nginx.conf
worker_processes 1;
events {
worker_connections 1024;
}
http {
include mime.types;
sendfile on;
keepalive_timeout 65;
server {
listen LISTEN_PORT default_server;
server_name _;
root /app/public;
index index.php;
charset utf-8;
location / {
try_files $uri $uri/ /index.php?$query_string;
}
location = /favicon.ico { access_log off; log_not_found off; }
location = /robots.txt { access_log off; log_not_found off; }
access_log /dev/stdout;
error_log /dev/stderr;
sendfile off;
client_max_body_size 100m;
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
include fastcgi_params;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_intercept_errors off;
fastcgi_buffer_size 32k;
fastcgi_buffers 8 32k;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
}
#include /etc/nginx/sites-enabled/*;
}
daemon off;
Dockerfile
FROM php:8.0-fpm-alpine
RUN apk add --no-cache nginx wget
RUN docker-php-ext-install mysqli pdo pdo_mysql
RUN mkdir -p /run/nginx
COPY docker/nginx.conf /etc/nginx/nginx.conf
RUN mkdir -p /app
COPY . /app
RUN sh -c "wget http://getcomposer.org/composer.phar && chmod a+x composer.phar && mv composer.phar /usr/local/bin/composer"
RUN cd /app && \
/usr/local/bin/composer install --no-dev
RUN chown -R www-data: /app
CMD sh /app/docker/startup.sh
Laravel version:
v9
Please let me know if you need some data that is not indicated yet on my post.
Increase max_execution_time in the PHP configuration. By default it is only 30 seconds. Make it 30 minutes, for example:
max_execution_time = 1800
Increase the nginx timeouts. Since the backend here is PHP-FPM over FastCGI, the fastcgi_* timeouts are the ones that apply:
http{
...
fastcgi_read_timeout 1800;
fastcgi_connect_timeout 1800;
fastcgi_send_timeout 1800;
send_timeout 1800;
keepalive_timeout 1800;
...
}
Another idea worth investigating is giving your Cloud Run instance more resources (more CPUs, more RAM) so it can process the request faster and avoid the timeout, but the timeout itself will most likely still need to be increased.
I think the issue has nothing to do with php, laravel, or nginx, but with Cloud Run.
As you can see in the Google Cloud documentation when they describe HTTP 504: Gateway timeout errors:
HTTP 504
The request has been terminated because it has reached the maximum request
timeout.
If your service is processing long requests, you can increase the request
timeout. If your service doesn't return a response within the time
specified, the request ends and the service returns an HTTP 504 error, as
documented in the container runtime contract.
As suggested in the docs, please, try increasing the request timeout until your application can process the huge POST data you mentioned: it is set by default to 5 minutes, but can be extended up to 60 minutes.
As described in the docs, you can set it through the Google Cloud console and the gcloud CLI; directly, or by modifying the service YAML configuration.
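For example, with the gcloud CLI (a sketch; the service name and region are placeholders for your own):
gcloud run services update my-laravel-service \
  --region=us-central1 \
  --timeout=3600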
The default nginx timeout is 60s. Since you mentioned the payload is around 64 MB, your backend may not be able to process that request and send back the response within 60s.
So you could try increasing the nginx timeouts. Since requests are passed to PHP-FPM via FastCGI, add the fastcgi_* timeouts below to your nginx.conf file:
http{
...
fastcgi_read_timeout 300;
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
keepalive_timeout 300;
...
}
A better approach would be to not process the data immediately: push it to a message queue, send the response instantly, and let background workers handle the processing. I don't know much about Laravel; in Django we would use RabbitMQ with Celery/pika.
To get the result of such a heavy request, you can poll the server at a regular interval or set up a WebSocket connection.

Nginx Server Configuration: Hostname not resolving on Subdomain

Thank you in advance for your support.
I set up an Ubuntu server with nginx as a DigitalOcean droplet and am using the server provisioning tool Laravel Forge, which works fine. I successfully installed the PHP framework Laravel and deployed the code to the server. I SSH'd into the server and checked the files in the expected directory; the code is successfully deployed.
Next, I own a domain and created an A record for the subdomain app.mywebsite.de that points to that server. I followed the DigitalOcean instructions and waited the required time. Using a DNS lookup tool, I confirmed that the subdomain actually points to the server.
Screenshot of DNS Lookup
Yet when I use the subdomain in my browser, the browser doesn't connect to the server. I get the following error message in the browser:
Screenshot of Browser when connecting to server
It seems like the subdomain correctly points to the server, but the server isn't configured correctly. I checked the nginx configuration, and under sites-available I have the following configuration for the subdomain:
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/app.mywebsite.de/before/*;
server {
listen 80;
listen [::]:80;
server_name app.mywebsite.de;
server_tokens off;
root /home/forge/app.mywebsite.de/public;
# FORGE SSL (DO NOT REMOVE!)
# ssl_certificate;
# ssl_certificate_key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers TLS13-AES-256-GCM-SHA384:TLS13-CHACHA20-POLY1305-SHA256:TLS_AES_256_GCM_SHA384:TLS-AES-256-GCM-SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS-CHACHA20-POLY1305-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/nginx/dhparams.pem;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Content-Type-Options "nosniff";
index index.html index.htm index.php;
charset utf-8;
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/app.mywebsite.de/server/*;
location / {
try_files $uri $uri/ /index.php?$query_string;
}
location = /favicon.ico { access_log off; log_not_found off; }
location = /robots.txt { access_log off; log_not_found off; }
access_log off;
error_log /var/log/nginx/app.mywebsite.de-error.log error;
error_page 404 /index.php;
location ~ \.php$ {
fastcgi_split_path_info ^(.+\.php)(/.+)$;
fastcgi_pass unix:/var/run/php/php8.0-fpm.sock;
fastcgi_index index.php;
include fastcgi_params;
}
location ~ /\.(?!well-known).* {
deny all;
}
}
# FORGE CONFIG (DO NOT REMOVE!)
include forge-conf/app.mywebsite.de/after/*;
In forge-conf/app.mywebsite.de/before/* there is only one file, redirect.conf, with the following code:
server {
listen 80;
listen [::]:80;
server_name www.app.mywebsite.de;
if ($http_x_forwarded_proto = 'https') {
return 301 https://app.mywebsite.de$request_uri;
}
return 301 $scheme://app.mywebsite.de$request_uri;
}
There are no other sites on the server, so apart from this one there is only the 000-catch-all file in the sites-available directory of the nginx configuration folder.
Unfortunately, I've reached the limit of my understanding here, and I would love it if somebody could point me in the right direction to find out which part of nginx is not configured correctly.
P.S.
Some additional info:
Yes, I restarted nginx and the whole server multiple times.
Turns out everything was configured correctly. I didn't change anything except adding some additional sites to the nginx server. Forge probably updated the server blocks, which resolved the problem.

HTTP/2 server pushed assets fail to load (HTTP2_CLIENT_REFUSED_STREAM)

I get the error ERR_HTTP2_CLIENT_REFUSED_STREAM in the Chrome DevTools console for all or some of the assets pushed via HTTP/2.
Refreshing the page and clearing the cache randomly fixes this issue, sometimes partially and sometimes completely.
I am using nginx with HTTP/2 enabled (SSL via Certbot) and Cloudflare.
server {
server_name $HOST;
root /var/www/$HOST/current/public;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Content-Type-Options "nosniff";
index index.html index.htm index.php;
charset utf-8;
set $auth "dev-Server.";
if ($request_uri ~ ^/opcache-api/.*$){
set $auth off;
}
auth_basic $auth;
auth_basic_user_file /etc/nginx/.htpasswd;
location / {
try_files $uri $uri/ /index.php?$query_string;
}
location = /favicon.ico { access_log off; log_not_found off; }
location = /robots.txt { access_log off; log_not_found off; }
error_page 404 /index.php;
location ~ \.php$ {
fastcgi_pass unix:/var/run/php/php7.2-fpm.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
include fastcgi_params;
}
location ~ /\.(?!well-known).* {
deny all;
}
location ~* \.(?:css|js)$ {
access_log off;
log_not_found off;
# Let laravel query strings burst the cache
expires 1M;
add_header Cache-Control "public";
# Or force cache revalidation.
# add_header Cache-Control "public, no-cache, must-revalidate";
}
location ~* \.(?:jpg|jpeg|gif|png|ico|xml|svg|webp)$ {
access_log off;
log_not_found off;
expires 6M;
add_header Cache-Control "public";
}
location ~* \.(?:woff|ttf|otf|woff2|eot)$ {
access_log off;
log_not_found off;
expires max;
add_header Cache-Control "public";
types {font/opentype otf;}
types {application/vnd.ms-fontobject eot;}
types {font/truetype ttf;}
types {application/font-woff woff;}
types {font/x-woff woff2;}
}
listen 443 ssl http2;
ssl_certificate /etc/letsencrypt/live/$HOST/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/$HOST/privkey.pem; # managed by Certbot
# include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_ciphers EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:EECDH+3DES:RSA+3DES:!MD5;
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
if ($host = $HOST) {
return 301 https://$host$request_uri;
} # managed by Certbot
server_name $HOST;
listen 80;
return 404; # managed by Certbot
}
Googling this error doesn't return many results. If it helps, it's a Laravel 6 app that pushes those assets; if I disable asset pushing in Laravel, all assets load correctly.
I don't even know where to start looking.
Update 1
I enabled Chrome logging and inspected the logs using Sawbuck, following the instructions provided here, and found that the actual error is related to a 414 HTTP response, which implies some caching problem.
Update 2
I found the great article "The browser can abort pushed items if it already has them", which states the following:
Chrome will reject pushes if it already has the item in the push cache. It rejects with PROTOCOL_ERROR rather than CANCEL or REFUSED_STREAM.
it also links to some chrome and mozilla bugs.
That led me to disable Cloudflare completely and test directly against the server. I tried various Cache-Control directives and also tried disabling the header, but the same error occurs upon a refresh after a cache clear.
Apparently, Chrome cancels the HTTP/2 pushed asset even when it is not present in the push cache, leaving the page broken.
For now, I'm disabling HTTP/2 server push in the Laravel app as a temporary fix.
We just got the exact same problem you're describing. We got "net::ERR_HTTP2_CLIENT_REFUSED_STREAM" on one of our JavaScript files. Reloading and clearing the cache worked, but then the problem came back, seemingly randomly. The same issue occurred in Chrome and Edge (Chromium-based). Then I tried Firefox and got the same behavior, but Firefox complained that the response for that URL was "text/html". My guess is that for some reason we had gotten a "text/html" response cached for that URL in Cloudflare. When I opened that URL directly in Firefox I got "application/javascript", and then the problem went away. Still not quite sure how this all happened, though.
EDIT:
In our case it turned out that the response for a .js file was blocked by the server with a 401, and we didn't send out any cache headers. Cloudflare tried to be helpful: because the browser was expecting a .js file, the response was cached even though the status was 401. This later failed for others because we tried to HTTP/2 push a text/html response with status 401 as a .js file. Luckily, Firefox gave us a better, actionable error message.
EDIT2:
Turns out it wasn't an HTTP header cache issue. We had cookie authentication on the .js files, and it seems that HTTP/2 push requests don't always include cookies. The fix was to allow cookieless requests for the .js files.
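In an nginx setup like the one in the question (HTTP basic auth), one way to achieve something similar would be to exempt the static-asset locations from auth. This is only a sketch, not our exact fix, since our auth was application-level:
location ~* \.(?:css|js)$ {
    # Let pushed streams succeed even when the request carries no credentials
    auth_basic off;
    expires 1M;
    add_header Cache-Control "public";
}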
For anyone who comes here using Symfony's local development server (symfony server:start): I could not fix it, but this setting (which is the default) stops the server from trying to push preloaded assets, and the error goes away:
config/packages/dev/webpack_encore.yaml
webpack_encore:
preload: false

Purging nginx cache files does not always work

I run an nginx server + PHP webservices API. I use nginx's fastcgi_cache to cache all GET requests, and when certain resources are updated, I purge one or more related cached resources.
The method I'm using to do this is to calculate the nginx cache file name for each resource I want to purge, and then deleting that file. For the most part, this works well.
However, I've found that sometimes, even after the cache file is deleted, nginx will still return data from cache.
This is not a problem with selecting the correct cache file to delete -- as part of my testing, I've deleted the entire cache directory, and nginx still returns HIT responses.
Is anyone aware of why this might be happening? Is it possible that another cache is involved? E.g., could it be that the OS is returning a cached version of the cache file to nginx, so nginx is not aware that it's been deleted?
I'm running this on CentOS, and with this nginx config (minus irrelevant parts):
user nginx;
# Let nginx figure out the best value
worker_processes auto;
events {
worker_connections 10240;
multi_accept on;
use epoll;
}
# Maximum number of open files should be at least worker_connections * 2
worker_rlimit_nofile 40960;
# Enable regex JIT compiler
pcre_jit on;
http {
# TCP optimisation
sendfile on;
tcp_nodelay on;
tcp_nopush on;
# Configure keep alive
keepalive_requests 1000;
keepalive_timeout 120s 120s;
# Configure SPDY
spdy_headers_comp 2;
# Configure global PHP cache
fastcgi_cache_path /var/nginx/cache levels=1:2 keys_zone=xxx:100m inactive=24h;
# Enable open file caching
open_file_cache max=10000 inactive=120s;
open_file_cache_valid 120s;
open_file_cache_min_uses 5;
open_file_cache_errors off;
server {
server_name xxx;
listen 8080;
# Send all dynamic content requests to the main app handler
if (!-f $document_root$uri) {
rewrite ^/(.+) /index.php/$1 last;
rewrite ^/ /index.php last;
}
# Proxy PHP requests to php-fpm
location ~ [^/]\.php(/|$) {
# Enable caching
fastcgi_cache xxx;
# Only cache GET and HEAD responses
fastcgi_cache_methods GET HEAD;
# Caching is off by default and can only be enabled using Cache-Control response headers
fastcgi_cache_valid 0;
# Allow only one identical request to be forwarded (others will get a stale response)
fastcgi_cache_lock on;
# Define conditions for which stale content will be returned
fastcgi_cache_use_stale error timeout invalid_header updating http_500 http_503;
# Define cache key to uniquely identify cached objects
fastcgi_cache_key "$scheme$request_method$host$request_uri";
# Add a header to response to indicate cache results
add_header X-Cache $upstream_cache_status;
# Configure standard server parameters
fastcgi_split_path_info ^(.+\.php)(/.+)$;
include fastcgi_params;
# php-fpm config
fastcgi_param SCRIPT_URL $fastcgi_path_info;
fastcgi_param PATH_INFO $fastcgi_path_info;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_param REQUEST_SCHEME $scheme;
fastcgi_param REMOTE_USER $remote_user;
# Read buffer sizes
fastcgi_buffer_size 128k;
fastcgi_buffers 256 16k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
# Keep connection open to enable keep-alive
fastcgi_keep_conn on;
# Proxy to PHP
fastcgi_pass unix:/var/run/php-fpm/fpm.sock;
}
}
}
Now that I look at this, could the open_file_cache parameters be affecting the cache files?
Any ideas?
No, the OS does not cache files.
However, the reason this might be happening is that files are not actually fully deleted until the link count and the number of processes that have the file open both drop to zero.
The unlink(2) manual page, which documents the system call used by tools like rm, reads as follows:
The unlink() function removes the link named by path from its directory and decrements the link count of the file which was referenced by the link. If that decrement reduces the link count of the file to zero, and no process has the file open, then all resources associated with the file are reclaimed. If one or more processes have the file open when the last link is removed, the link is removed, but the removal of the file is delayed until all references to it have been closed.
Depending on the system, you can actually still recover such open files fully without any data loss, for example, see https://unix.stackexchange.com/questions/61820/how-can-i-access-a-deleted-open-file-on-linux-output-of-a-running-crontab-task.
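On a running system you can check whether nginx workers still hold deleted cache files open with something like this (assuming lsof and pgrep are installed):
lsof -p $(pgrep -d, nginx) | grep deleted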
So, indeed, open_file_cache would effectively preclude your deletion from having any effect within the processes that still have relevant file descriptors in their cache. You may want to use a shorter open_file_cache_valid if urgent purging after deletion is very important to you.
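For example (illustrative values, not a recommendation for every workload):
open_file_cache max=10000 inactive=30s;
open_file_cache_valid 10s;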
