nginx php5-fpm: upstream timed out (110: Connection timed out) while connecting to upstream
We have a web server running an nginx + php5-fpm + APC setup.
Recently, however, we have been seeing upstream connection timeout errors and slowdowns during page rendering. A quick php5-fpm restart fixes the problem, but we could not find the cause.
We have another web server running apache2 under another subdomain, connecting to the same database and doing exactly the same job. The slowdowns occur only on the nginx/php5-fpm server.
I suspect php5-fpm or APC is causing the problem.
The nginx error log shows various connection timeouts:
upstream timed out (110: Connection timed out) while connecting to upstream bla bla bla
The php5-fpm log shows nothing unusual, just children starting and exiting:
Apr 07 22:37:27.562177 [NOTICE] [pool www] child 29122 started
Apr 07 22:41:47.962883 [NOTICE] [pool www] child 28346 exited with code 0 after 2132.076556 seconds from start
Apr 07 22:41:47.963408 [NOTICE] [pool www] child 29172 started
Apr 07 22:43:57.235164 [NOTICE] [pool www] child 28372 exited with code 0 after 2129.135717 seconds from start
The server was not under load when the errors occurred: the load average was just 2 (2 CPUs, 16 cores), and the php5-fpm processes seemed to be working fine.
nginx conf:
user www-data;
worker_processes 14;
pid /var/run/nginx.pid;
# set open fd limit to 30000
worker_rlimit_nofile 30000;
events {
    worker_connections 768;
    # multi_accept on;
}
http {
    ##
    # Basic Settings
    ##
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    # server_tokens off;
    # server_names_hash_bucket_size 64;
    # server_name_in_redirect off;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    ##
    # Logging Settings
    ##
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;
    ##
    # Gzip Settings
    ##
    gzip on;
    gzip_disable "msie6";
    # gzip_vary on;
    # gzip_proxied any;
    # gzip_comp_level 6;
    # gzip_buffers 16 8k;
    # gzip_http_version 1.1;
    # gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    ##
    # Virtual Host Configs
    ##
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
nginx enabled site conf:
    location ~* \.php$ {
        fastcgi_split_path_info ^(.+\.php)(.*)$;
        fastcgi_pass backend;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
        fastcgi_param QUERY_STRING $query_string;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_intercept_errors off;
        fastcgi_ignore_client_abort off;
        fastcgi_connect_timeout 20;
        fastcgi_send_timeout 20;
        fastcgi_read_timeout 180;
        fastcgi_buffer_size 128k;
        fastcgi_buffers 4 256k;
        fastcgi_busy_buffers_size 256k;
        fastcgi_temp_file_write_size 256k;
    }
    ## Disable viewing .htaccess & .htpassword
    location ~ /\.ht {
        deny all;
    }
}
upstream backend {
    server 127.0.0.1:9000;
}
fpm conf:
pm.max_children = 500
pm.start_servers = 100
pm.min_spare_servers = 50
pm.max_spare_servers = 100
pm.max_requests = 10000
There are emergency restart settings in the fpm conf file. Would they help us fix the issue?
emergency_restart_interval = 0
Comments (1)
Firstly, reduce the PHP-FPM pm.max_requests to 100; you want PHP worker processes to be recycled much sooner than every 10000 requests.
Secondly, you've only got one PHP process running with lots of children. This is fine for development, but in production you want more PHP processes, each with fewer children, so that if one process goes down for any reason there are others which can take up the slack. So, rather than a ratio of 1:50 as you have now, go for a ratio of 10:5. This will be much more stable.
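On the nginx side, that layout could look roughly like the sketch below. It assumes several independent PHP-FPM master processes, each listening on its own local port with a small pm.max_children; ports 9001 to 9003 are hypothetical (only 9000 appears in the question), and further masters would follow the same pattern.

```nginx
# Sketch only: replaces the single-server "backend" upstream from the question.
# Each server line is assumed to be a separate PHP-FPM master with its own
# pool config (its own "listen" address and a small pm.max_children).
upstream backend {
    server 127.0.0.1:9000;
    server 127.0.0.1:9001;   # hypothetical additional master
    server 127.0.0.1:9002;   # hypothetical additional master
    server 127.0.0.1:9003;   # hypothetical additional master
}
```

With no balancing method specified, nginx distributes requests round-robin across the listed servers and temporarily skips one that fails to accept connections, so a single stuck master no longer takes the whole site down.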
To achieve this you may want to look at something like supervisor to manage your PHP processes. We use this in production and it has really helped increase our uptime and reduce the amount of time we spend managing/monitoring the servers. Here's an example of our config:
/etc/php5/php-fpm.conf:
/etc/supervisor.d/php-fpm.conf:
/etc/nginx/conf/php.backend:
EDIT:
As with all server set-ups, don't rely on guesswork to track down where your issues are. I recommend installing Munin along with the various PHP(-FPM) and Nginx plugins; these will help you track hard statistics on requests, response times, memory usage, disk accesses and thread/process levels, all of which are essential when tracking down where the issues are.
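Monitoring plugins of this kind usually poll status endpoints, so those need to be exposed first. The following is a minimal sketch, not our actual config: the port 8080 and the paths /nginx_status and /fpm_status are made-up names, nginx must be built with the stub_status module, and the FPM pool is assumed to have pm.status_path = /fpm_status set.

```nginx
# Sketch only: a local-only server block for monitoring endpoints.
# Port and paths are assumptions chosen for illustration.
server {
    listen 127.0.0.1:8080;

    # nginx connection counters (requires ngx_http_stub_status_module)
    location = /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }

    # PHP-FPM pool status; assumes pm.status_path = /fpm_status in the pool config
    location = /fpm_status {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass backend;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }
}
```

The FPM status output includes fields such as "listen queue" and "max children reached", which are worth watching when nginx reports timeouts while connecting to the upstream.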
In addition, as I mentioned in a comment below, adding both server- and client-side caching to your set-up, even at a modest level, can give users a better experience, whether you use nginx's native caching support or something more specific like varnishd. Even the most dynamic sites and apps have many static elements which can be held in memory and served faster. Serving these from cache reduces the overall load and ensures that the elements which absolutely need to be dynamic have all the resources they need when they need them.
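For nginx's native caching, a modest starting point might look like the sketch below. Everything specific in it is an assumption for illustration: the cache directory, the zone name phpcache, the sizes and lifetimes, and the use of a PHPSESSID cookie to detect logged-in users. It is written as a fragment for a file included from the http block, such as the conf.d directory in the question.

```nginx
# Sketch only: short-lived FastCGI response cache. Paths, zone name,
# sizes, lifetimes and the session cookie name are assumptions.
fastcgi_cache_path /var/cache/nginx/fastcgi levels=1:2
                   keys_zone=phpcache:10m max_size=256m inactive=10m;

server {
    listen 80;

    location ~* \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass backend;

        fastcgi_cache phpcache;
        fastcgi_cache_key "$scheme$request_method$host$request_uri";
        fastcgi_cache_valid 200 1m;            # cache successful responses for one minute
        fastcgi_cache_use_stale error timeout updating;

        # Do not read from or write to the cache for requests carrying a session cookie
        fastcgi_cache_bypass $cookie_PHPSESSID;
        fastcgi_no_cache $cookie_PHPSESSID;
    }
}
```

Even a one-minute lifetime means a burst of identical anonymous requests reaches PHP-FPM once rather than once per visitor, and fastcgi_cache_use_stale lets nginx keep serving the last good copy if the backend times out.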