I've been experimenting for quite some time with a "typical" Django setup on nginx + apache2 + mod_wsgi + memcached (+ postgresql), reading the docs and some questions on SO and SF (see comments).
Since I'm still unsatisfied with the behaviour (almost certainly because of some misconfiguration on my part), I would like to know what a good configuration would look like under these assumptions:
- Quad-Core Xeon 2.8GHz
- 8 gigs memory
- several Django projects (anything special related to this?)
These are excerpts from my current confs:
EDIT: I've added more material to make this complete, but following Graham's suggestion I will follow up on the mod_wsgi mailing list.
apache2 (output of apache2 -V)
Server version: Apache/2.2.12 (Ubuntu)
Server built: Nov 18 2010 21:16:51
Server's Module Magic Number: 20051115:23
Server loaded: APR 1.3.8, APR-Util 1.3.9
Compiled using: APR 1.3.8, APR-Util 1.3.9
Architecture: 64-bit
Server MPM: Worker
threaded: yes (fixed thread count)
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/worker"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT=""
-D SUEXEC_BIN="/usr/lib/apache2/suexec"
-D DEFAULT_PIDLOG="/var/run/apache2.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
-D SERVER_CONFIG_FILE="/etc/apache2/apache2.conf"
apache2 conf
PidFile ${APACHE_PID_FILE}
Timeout 60
KeepAlive Off
ServerSignature Off
ServerTokens Prod
#MaxKeepAliveRequests 100
#KeepAliveTimeout 15
# worker MPM
<IfModule mpm_worker_module>
StartServers 2
ServerLimit 4
MinSpareThreads 2
MaxSpareThreads 4
ThreadLimit 32
ThreadsPerChild 16
MaxClients 64
MaxRequestsPerChild 10000
</IfModule>
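For context on those numbers: with the worker MPM, MaxClients cannot exceed ServerLimit x ThreadsPerChild, so the block above is internally consistent only at 64. A restatement with the arithmetic spelled out (the previous value of 128 would also have required ServerLimit 8):

    <IfModule mpm_worker_module>
        # Capacity = ServerLimit x ThreadsPerChild = 4 x 16 = 64,
        # so MaxClients 64 is the ceiling here; to go back to
        # MaxClients 128 you would need ServerLimit 8 (8 x 16 = 128).
        ServerLimit      4
        ThreadsPerChild 16
        MaxClients      64
    </IfModule>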
...
SetEnv VHOST null
#WSGIPythonOptimize 2
<VirtualHost *:8082>
ServerName subdomain.domain.com
ServerAlias www.domain.com
SetEnv VHOST subdomain.domain
AddDefaultCharset UTF-8
ServerSignature Off
LogFormat "%{X-Real-IP}i %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" custom
ErrorLog /home/project1/var/logs/apache_error.log
CustomLog /home/project1/var/logs/apache_access.log custom
AllowEncodedSlashes On
WSGIDaemonProcess subdomain.domain user=www-data group=www-data threads=25
WSGIScriptAlias / /home/project1/project/wsgi.py
WSGIProcessGroup %{ENV:VHOST}
</VirtualHost>
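On the "should I specify processes?" question below: WSGIDaemonProcess defaults to a single process, so the line above gives one process running 25 threads. A sketch of the multi-process variant, keeping the same total of 25 threads (the processes= and threads= options are standard; the 5x5 split is just illustrative):

    WSGIDaemonProcess subdomain.domain user=www-data group=www-data processes=5 threads=5
    WSGIProcessGroup subdomain.domain

If you name the process group directly like this, the %{ENV:VHOST} indirection above is no longer needed for this vhost.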
wsgi.py
Currently using mod_wsgi 3.3 built from source.
import os
import sys
# setting all the right paths....
_realpath = os.path.realpath(os.path.dirname(__file__))
_public_html = os.path.normpath(os.path.join(_realpath, '../'))
sys.path.append(_realpath)
sys.path.append(os.path.normpath(os.path.join(_realpath, 'apps')))
sys.path.append(os.path.normpath(_public_html))
sys.path.append(os.path.normpath(os.path.join(_public_html, 'libs')))
sys.path.append(os.path.normpath(os.path.join(_public_html, 'django')))
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
import django.core.handlers.wsgi
_application = django.core.handlers.wsgi.WSGIHandler()
def application(environ, start_response):
    """
    Launches Django, passing along some environment (domain name) settings.

    The mod_wsgi application group is required; it is also used to derive
    the HOST.DOMAIN.TLD and PORT values handed over via os.environ.
    """
    application_group = environ['mod_wsgi.application_group']
    assert application_group
    fields = application_group.replace('|', '').split(':')
    server_name = fields[0]
    os.environ['WSGI_APPLICATION_GROUP'] = application_group
    os.environ['WSGI_SERVER_NAME'] = server_name
    if len(fields) > 1:
        os.environ['WSGI_PORT'] = fields[1]
    # 'sub.domain.tld' -> ['sub', 'domain', 'tld'], reversed so that the
    # TLD, domain and host parts land at fixed indexes.
    splitted = server_name.rsplit('.', 2)
    assert len(splitted) >= 2
    splitted.reverse()
    if len(splitted) > 0:
        os.environ['WSGI_TLD'] = splitted[0]
    if len(splitted) > 1:
        os.environ['WSGI_DOMAIN'] = splitted[1]
    if len(splitted) > 2:
        os.environ['WSGI_HOST'] = splitted[2]
    return _application(environ, start_response)
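One caveat worth flagging in the file above: os.environ is shared by every thread in the daemon process, so with threads=25 all concurrent requests write to the same variables. In this layout each daemon process serves a single application group, so the values written are always identical and it mostly works; still, the idiomatic place for per-request values is the WSGI environ itself. A minimal sketch of that alternative (the 'myproject.' keys are hypothetical, chosen just for illustration):

    def application(environ, start_response):
        # Keep per-request values in the request's own environ dict,
        # which is not shared between threads, instead of os.environ.
        application_group = environ['mod_wsgi.application_group']
        fields = application_group.replace('|', '').split(':')
        environ['myproject.server_name'] = fields[0]
        if len(fields) > 1:
            environ['myproject.port'] = fields[1]
        return _application(environ, start_response)

Django code can then read those values per request via request.META, which is the WSGI environ.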
folder structure
in case it matters (slightly shortened actually)
/home/www-data/projectN/var/logs
/project (contains manage.py, wsgi.py, settings.py)
/project/apps (all the project apps are here)
/django
/libs
Please forgive me in advance if I overlooked something obvious.
My main question is about the Apache2/mod_wsgi settings. Are those fine? Is 25 threads an /ok/ number on a quad core for a single Django project? Is it still ok with several Django projects on different virtual hosts? Should I specify 'processes'? Are there other directives I should add? Is there anything really bad in the wsgi.py file?
I've been reading about potential issues with the standard wsgi.py file; should I switch to it anyway?
Or... should this conf just run fine, meaning I should look for issues somewhere else?
So, what do I mean by "unsatisfied"? Well, I often see quite high CPU wait; worse, apache2 relatively often gets stuck: it simply stops answering and has to be restarted. I have set up monit to take care of that, but it isn't a real solution. I have been wondering whether it's an issue with database access (postgresql) under heavy load, but even if it were, why would the apache2 processes get stuck?
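For completeness, the watchdog is just an ordinary monit process check; a sketch of the kind of rule I mean (pidfile from the apache2 -V output above; port and thresholds are illustrative):

    check process apache2 with pidfile /var/run/apache2.pid
        start program = "/etc/init.d/apache2 start"
        stop program  = "/etc/init.d/apache2 stop"
        if failed host 127.0.0.1 port 8082 protocol http for 3 cycles then restart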
Besides these two issues, performance is overall great. I even tried New Relic and got very good average results.
edit: I will not be able to provide an answer myself, as I have temporarily moved to an nginx+gunicorn environment.
Also, see the follow-up on the Google Groups thread for my particular situation and issues! It sounds like Graham is, of course, really busy (mod_wsgi is an unpaid side project!), but the move to Read the Docs sounds great, and solving that one backlog issue would be totally awesome. That, plus the new Apache 2.4, might make me reconsider the best combo (currently nginx+gunicorn; later I might drop nginx for a varnish+apache+mod_wsgi setup).
Enable mod_headers in Apache and then add this to your VirtualHost:
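(The directive itself is missing from the post; going by New Relic's documented Apache setup for request queueing, it is presumably:)

    RequestHeader set X-Request-Start "%t"

Here %t expands to the time Apache accepted the request, which New Relic compares against the time the application actually started handling it.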
New Relic will then show you queueing time in the main overview chart.
This is the time between when a request is first accepted by an Apache child worker process and when the mod_wsgi daemon process gets to handle it. It can be used as one indicator of requests getting backlogged, which in turn can indicate thread starvation in the daemon process due to deadlocked threads or threads waiting on an external resource.
Unfortunately, New Relic relies on a request completing to report data for it. So if a request gets stuck, you will not know about it, and if all threads get stuck, the daemon process will stop handling requests altogether.
The problem is that if the number of processes/threads across the Apache child worker processes is less than 100, the daemon process listener backlog, then all of those threads can also get stuck, and you will not know it from the Apache error logs: they will just sit there waiting for the daemon to accept a connection, which never happens. (With MaxClients 64, as above, that is exactly your situation.) Only the HTTP client will know, as it gets connection refused once the Apache child worker socket backlog fills up.
In mod_wsgi 4.0 I will be adding the ability to configure the listener backlog for the daemon process, so it can be reduced and you might at least get an error of some sort. There are already new options in mod_wsgi 4.0 to look for blocked threads, restart daemon processes automatically, and dump a stack trace of where the blocked threads were in the code at the time.
To get that you would need to use the mod_wsgi 4.0 dev code from the mod_wsgi repo. You could then set blocked-timeout on WSGIDaemonProcess to 60 seconds; when all threads get stuck, it will restart and recover, plus dump stack traces. I am still tweaking this, and there are other related configuration options, which is why I won't describe them here.
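As a sketch, and assuming the 4.0 dev option keeps the name used above (it is unreleased code, so this may change), the daemon line from the question would become:

    WSGIDaemonProcess subdomain.domain user=www-data group=www-data threads=25 blocked-timeout=60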
The mod_wsgi 4.0 code also has some other features which can be used with custom charting in New Relic to track a growing number of blocked threads. I am not happy with that yet and it needs to change a bit, but it is stable.
Anyway, jump onto the mod_wsgi mailing list to discuss further.