We are running a server with this configuration: Nginx + FastCGI (PHP-FPM) + PHP 7.3 + MySQL 5.7. However, we are experiencing issues with long-running PHP scripts, so we have set up these restrictions:
php.ini
...
max_execution_time = 60
max_input_time = 60
...
pool.d/xxx.conf (pool config)
...
pm = dynamic
pm.max_children = 20
pm.max_requests = 500
request_terminate_timeout = 300
...
nginx virtualhost
location ~ \.php$ {
...
fastcgi_connect_timeout 300s;
fastcgi_read_timeout 300s;
fastcgi_send_timeout 300s;
fastcgi_keep_conn on;
...
}
When we check our php-fpm logs, we see that some workers are SIGKILL-ed / SIGTERM-ed, but what matters here is the time after which the workers are killed. This time is greater than every timeout set in the config files above. For example, the first script below (Apr 20 19:09:24) timed out after 400s and was killed after 428s. The second one (Apr 20 19:05:34) was killed after 427s but processed only 1 request (according to the FPM status page), so that single request lasted 427s. Yet the php.ini timeout is 60s, and the other timeouts (nginx, php-fpm) are set to 300s, not more.
Apr 20 19:09:24 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 31032, script '/example.com/index.php' (request: "GET /index.php") execution timed out (399.965025 sec), terminating
Apr 20 19:09:24 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 31032 exited on signal 15 (SIGTERM) after 428.563463 seconds from start
...
Apr 20 19:05:34 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 30951 exited on signal 9 (SIGKILL) after 427.516062 seconds from start
Apr 20 19:05:34 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 30985 exited on signal 9 (SIGKILL) after 334.407532 seconds from start
Apr 20 19:05:34 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 30960 exited on signal 9 (SIGKILL) after 411.221728 seconds from start
Apr 20 19:05:34 srv01 php-fpm[21093]: [WARNING] [pool xxx] child 31031 exited on signal 9 (SIGKILL) after 210.926257 seconds from start
So my question is: how is this possible? Are we missing some other config directive (in nginx/php-fpm) to restrict these scripts to our desired maximum duration? Or could these issues be related to the PHP-FPM/Nginx communication problems mentioned, for example, here:
- https://github.com/perusio/drupal-with-nginx/issues/55 (related to "fastcgi_keep_conn" vs. the "dynamic" PHP-FPM process manager)
- https://bugs.php.net/bug.php?id=63395
The problem does not seem to be in MySQL (all MySQL processes are idle while the long-running scripts are being logged).
We are also working on application optimization (which should eliminate these long-running scripts in the future), but until then we need to tighten up the server so that we can run the application without problems. So we would like to know what else we should try in order to limit these scripts to a maximum of 300s.
Thank you for all your help!
Three possibilities come to mind: register_shutdown_function(), process forking via pcntl_fork(), and lastly, how time bookkeeping is done.
There is a note on the page about shutdown functions that states:
So max_execution_time will not stop a shutdown function if there is one.
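A minimal sketch (not from the original post) of how this plays out: the handler registered below runs after the main script ends, including after a "maximum execution time exceeded" fatal error, and max_execution_time no longer applies inside it.

```php
<?php
// Assumption for illustration: max_execution_time stops the *main* script,
// but a registered shutdown function still runs afterwards, so under FPM
// the worker can stay busy long past the configured limit.
ini_set('max_execution_time', '60');

register_shutdown_function(function () {
    // Under FPM this callback could do minutes of cleanup work here,
    // unconstrained by max_execution_time.
    echo "shutdown handler ran\n";
});

echo "main script done\n";
```

The handler fires whether the script ends normally or is killed by the timeout, which is exactly why a worker can outlive the 60s limit.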
Forking a process, on the other hand, allows the child process to run for as long as it likes, so you have to be careful when spawning child processes.
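A sketch of the forking case (assuming the pcntl extension is available — it is typically present in the CLI SAPI but usually not compiled into FPM workers): the forked child is a separate process, so any timeout applied to the parent request does not stop it.

```php
<?php
// Illustrative only: a child created with pcntl_fork() runs on its own
// clock; the parent's max_execution_time / request_terminate_timeout
// do not apply to it.
$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
} elseif ($pid === 0) {
    // Child: could keep running (e.g. sleep or loop) long after the
    // parent request has finished and been reaped by FPM.
    echo "child running independently\n";
    exit(0);
} else {
    // Parent: reap the child so it does not become a zombie.
    pcntl_waitpid($pid, $status);
    echo "parent done\n";
}
```

If a child like this is left running, process listings will show PHP processes that no nginx or FPM timeout can account for.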
Finally, some of these timers only count actual script execution time, so if the process is waiting (blocked in a system call) rather than doing work, the timer is not running. Time spent in sleep(), for example, does not count toward the max_execution_time limit, so a script can go on sleeping for a very long time.