We traced a problem that crashes our NGINX server running Magento down to the following point:
Background info: the Magento backend has a CMS function with a WYSIWYG editor. This editor loads some pictures via a controller in Magento (cms/directive).
When we set the NGINX error_log level to info, we get the following lines (line breaks inserted for better readability):
2012/10/22 18:05:40 [info] 14105#0: *1 client closed prematurely connection,
so upstream connection is closed too while sending request to upstream, client:
XXXXXXXXX, server: test.local, request: "GET
index.php/admin/cms_wysiwyg/directive/___directive/BASEENCODEDIMAGEURL,,/
HTTP/1.1",
upstream: "fastcgi://127.0.0.1:9024", host: "test.local"
When stepping through the code in the debugger, the following call never returns (in `Varien_Image_Adapter_Abstract::getMimeType()`):
# $this->_fileName is http://test.local/skin/adminhtml/base/default/images/demo-image-not-existing.gif
# $_SERVER['REQUEST_URI'] = http://test.local/admin/cms_wysiwyg/directive/___directive/BASEENCODEDIMAGEURL
list($this->_imageSrcWidth, $this->_imageSrcHeight, $this->_fileType, ) = getimagesize($this->_fileName);
The requested filename is
- a URL on the same server that is running the requesting script
- a link to a static .gif that does not exist.
Sample URL:
http://test.local/skin/adminhtml/base/default/images/demo-image-not-existing.gif
Once the above line is executed, the NGINX server stops responding to any subsequent request. After about 10 minutes, it starts answering requests again.
I tried to reproduce the error with a simple test script that only calls getimagesize()
with the given URL, but this does not crash. It simply leads to an exception saying that the URL could not be loaded (which is fine, as the URL is wrong).
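For anyone repeating that standalone test, it may help to bound how long getimagesize() is allowed to block on a remote source, so a hanging self-request cannot tie up the test process. The following is only a sketch: `probeImageSize()` is a hypothetical helper (not part of Magento), it assumes PHP 7.1 or later, and it relies on the `default_socket_timeout` ini setting, which limits connecting and each read rather than the total duration.

```php
<?php
// Hypothetical helper (not Magento code): probe an image while limiting how
// long socket-based sources such as http:// may block.
function probeImageSize(string $source, int $timeoutSeconds = 5): ?array
{
    // default_socket_timeout applies to socket-based streams; it bounds the
    // connect and each read, not the total request time.
    $previous = ini_set('default_socket_timeout', (string) $timeoutSeconds);

    // @ suppresses the warning getimagesize() emits for unreadable sources.
    $info = @getimagesize($source);

    // Restore the previous setting so the rest of the script is unaffected.
    if ($previous !== false) {
        ini_set('default_socket_timeout', $previous);
    }

    if ($info === false) {
        return null; // missing file, unreachable URL, or not an image
    }

    return ['width' => $info[0], 'height' => $info[1], 'mime' => $info['mime']];
}
```

Called with the sample URL above, this should return either null or the image dimensions instead of waiting indefinitely, assuming the server at least accepts or refuses the connection within the timeout.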
Current theory:
NGINX / PHP FCGI has a limited number of processes it can handle. The CMS WYSIWYG editor fires around 5 parallel requests at the `cms_wysiwyg/directive` action, which NGINX tries to complete. Let's say NGINX can handle only 5 parallel requests: each of these running requests now makes an additional request to the same server, which of course cannot be fulfilled because all slots are full. The slots cannot be released either, because completing each request depends on making one additional request.
Possible solutions:
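Before weighing solutions, the theory can be sanity-checked with a toy model. The sketch below is not a measurement of NGINX or PHP; `simulateSelfRequests()`, its parameters, and its return values are hypothetical names, and it assumes the worst case in which all editor requests have taken a worker before any nested request arrives.

```php
<?php
// Toy model, not a measurement: $slots is the assumed number of PHP workers,
// $outerRequests the number of parallel cms_wysiwyg/directive requests.
function simulateSelfRequests(int $slots, int $outerRequests): string
{
    // Each outer request that gets a worker holds it until its nested
    // request to the same server has finished.
    $busy = min($slots, $outerRequests);
    $free = $slots - $busy;

    // A nested request needs a free worker. With none left, no outer
    // request can finish, so no worker is ever released.
    if ($busy > 0 && $free === 0) {
        return 'deadlock';
    }

    // With at least one free worker, nested requests can be served one
    // after another and every outer request eventually completes.
    return 'completes';
}
```

Under these assumptions, 5 workers and 5 editor requests yield 'deadlock', while a sixth worker is enough for 'completes'. The model does not explain the recovery after about 10 minutes, which presumably comes from a timeout somewhere in the stack.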
I just had almost the same issue here. After viewing a CMS-block containing an uploaded image, the entire php/php-fpm setup became unresponsive.
The problem turned out to be a call to getimagesize (in my case around line 72 in lib/Varien/Image/Adapter/Gd2.php). Even though the image file in question was located on the site itself, the parameter to getimagesize was an HTTP URL. Because of a weird firewall configuration, the server was unable to contact itself via HTTP, so the request hung and, for reasons unknown, php-fpm stopped serving requests altogether.
Finally, nginx lost patience and logged a timeout error.
After allowing the server to make HTTP requests to itself, the timeout errors went away.
I'm still puzzled why Magento would access a local file through HTTP, but it was probably the easiest way to support external images in the wysiwyg editor.
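If the image lives on the same host, one way to sidestep the HTTP round trip would be to translate the URL into a filesystem path before it reaches getimagesize. The following is a minimal sketch of that idea, not Magento's code and not something I have tested inside Magento: `urlToLocalPath()`, the base URL, and the document root are all hypothetical, and PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper (not part of Magento): map a URL on the site's own
// base URL to a local path, or return null for external images.
function urlToLocalPath(string $url, string $baseUrl, string $docRoot): ?string
{
    $base = rtrim($baseUrl, '/') . '/';
    if (strpos($url, $base) !== 0) {
        return null; // external image: the caller keeps using the URL
    }

    // Keep only the path part below the base URL, then decode %-escapes.
    $relative = substr($url, strlen($base));
    $relative = preg_replace('/[?#].*$/s', '', $relative);
    $relative = rawurldecode($relative);

    // Refuse empty paths and anything that tries to leave the document root.
    if ($relative === '' || in_array('..', explode('/', $relative), true)) {
        return null;
    }

    return rtrim($docRoot, '/') . '/' . $relative;
}
```

With an assumed base URL of `http://test.local/` and an assumed document root of `/var/www/magento`, a URL such as `http://test.local/skin/adminhtml/base/default/images/demo-image-not-existing.gif` would map to a path under `/var/www/magento/skin/`, and a missing file would then fail immediately instead of waiting on a second worker.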
(Magento 1.8.1.0)