I've been knocking my brains on this for a week...
Setup: Ubuntu 16.04 LAMP stack, php 5.6 (yes I know I need to upgrade this). The server runs many sites for one organization, one of which gets constant public usage. The others are mostly subdomains for narrow usages. Most of the sites are open source applications. On AWS T2.large (2 processors, 8GB RAM). Storage is SSD. Single server hosts both apache and MySQL.
This set up has run really well for a couple of years. In June there was a period of about 10 days where server load shot up super high (100+) and I did some tuning but ultimately solved the problem by creating a new instance of the server. Chalked up to degraded hardware (?).
It started happening again about 10 days ago. The pattern is interesting. It's not totally consistent but there tends to be a spike in load every 30 minutes. I see no cron jobs that correlate with this. The load surges up to around 16-18. Sometimes when this happens, the server becomes unresponsive (not surprising) but it doesn't look like the proverbial "wedged server". When it goes unresponsive, the load decreases until it's well under 0.5, and stays there without recovery. Basically, it's dead. An apache restart doesn't bring it back to life, but an apache stop/apache start DOES bring it back. When it's in it's "dead" state, top shows that mysql uses about 0.2% of CPU. When it's under heavy load, mysql uses about 30-60% of 2 CPU cores (60-120% of a single core's CPU).
I can't think of anything that's changed since this started, other than a continual addition of data. The "big" database is about 1gb in size.
I tried creating a new instance for the server as in June, this time it didn't help.
I have attempted to do tuning, but with all the restarts it's hard to accumulate good data.
MYSQL has been running for a day, here's the output of tuning-primer.sh
:
-- MYSQL PERFORMANCE TUNING PRIMER --
- By: Matthew Montgomery -
MySQL Version 5.7.27-0ubuntu0.16.04.1-log x86_64
Uptime = 1 days 5 hrs 33 min 50 sec
Avg. qps = 338
Total Questions = 36005313
Threads Connected = 1
Warning: Server has not been running for at least 48hrs.
It may not be safe to use these recommendations
To find out more information on how each of these
runtime variables effects performance visit:
http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html
Visit http://www.mysql.com/products/enterprise/advisors.html
for info about MySQL's Enterprise Monitoring and Advisory Service
SLOW QUERIES
The slow query log is NOT enabled.
Current long_query_time = 10.000000 sec.
You have 210572 out of 36005334 that take longer than 10.000000 sec. to complete
Your long_query_time seems to be fine
BINARY UPDATE LOG
The binary update log is NOT enabled.
You will not be able to do point in time recovery
See http://dev.mysql.com/doc/refman/5.7/en/point-in-time-recovery.html
WORKER THREADS
Current thread_cache_size = 16
Current threads_cached = 15
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine
MAX CONNECTIONS
Current max_connections = 300
Current threads_connected = 1
Historic max_used_connections = 41
The number of used connections is 13% of the configured maximum.
Your max_connections variable seems to be fine.
INNODB STATUS
Current InnoDB index space = 70 M
Current InnoDB data space = 195 M
Current InnoDB buffer pool free = 25 %
Current innodb_buffer_pool_size = 384 M
Depending on how much space your innodb indexes take up it may be safe
to increase this value to up to 2 / 3 of total system memory
MEMORY USAGE
Max Memory Ever Allocated : 1.75 G
Configured Max Per-thread Buffers : 2.58 G
Configured Max Global Buffers : 1.40 G
Configured Max Memory Limit : 3.98 G
Physical Memory : 7.79 G
Max memory limit seem to be within acceptable norms
KEY BUFFER
Current MyISAM index space = 853 M
Current key_buffer_size = 1.00 G
Key cache miss rate is 1 : 16883
Key buffer free ratio = 78 %
Your key_buffer_size seems to be too high.
Perhaps you can use these resources elsewhere
QUERY CACHE
Query cache is enabled
Current query_cache_size = 16 M
Current query_cache_used = 11 M
Current query_cache_limit = 1 M
Current Query cache Memory fill ratio = 72.74 %
Current query_cache_min_res_unit = 4 K
MySQL won't cache query results that are larger than query_cache_limit in size
SORT OPERATIONS
Current sort_buffer_size = 256 K
Current read_rnd_buffer_size = 256 K
Sort buffer seems to be fine
JOINS
Current join_buffer_size = 8.00 M
You have had 6245 queries where a join could not use an index properly
join_buffer_size >= 4 M
This is not advised
You should enable "log-queries-not-using-indexes"
Then look for non indexed joins in the slow query log.
OPEN FILES LIMIT
Current open_files_limit = 3910 files
The open_files_limit should typically be set to at least 2x-3x
that of table_cache if you have heavy MyISAM usage.
Your open_files_limit value seems to be fine
TABLE CACHE
Current table_open_cache = 1800 tables
Current table_definition_cache = 2200 tables
You have a total of 2189 tables
You have 1792 open tables.
Current table_cache hit rate is 2%
, while 99% of your table cache is in use
You should probably increase your table_cache
TEMP TABLES
Current max_heap_table_size = 384 M
Current tmp_table_size = 384 M
Of 238038 temp tables, 17% were created on disk
Created disk tmp tables ratio seems fine
TABLE SCANS
Current read_buffer_size = 128 K
Current table scan ratio = 125 : 1
read_buffer_size seems to be fine
TABLE LOCKING
Current Lock Wait ratio = 1 : 52
You may benefit from selective use of InnoDB.
If you have long running SELECT's against MyISAM tables and perform
frequent updates consider setting 'low_priority_updates=1'
If you have a high concurrency of inserts on Dynamic row-length tables
consider setting 'concurrent_insert=ALWAYS'.
Apache has been running for .5 hours. That won't yield useful recommendations but does show configuration data. Here's the output of apache2buddy.pl:
#####################################################################
apache2buddy.pl report for ip-xxx-xxx-xxx-xxx.ec2.internal (xxx.xxx.xxx.xxx)
#####################################################################
[ OK ] This script is being run as root.
[ OK ] The utility 'pmap' exists and is available for use: /usr/bin/pmap
[ OK ] The utility 'netstat' exists and is available for use: /bin/netstat
[ OK ] 'php' exists and is available for use: /usr/bin/php
[ OK ] The utility 'apachectl' exists and is available for use: /usr/sbin/apachectl
[ OK ] The 'python' binary exists and is available for use: /usr/bin/python
[ OK ] The port (port 80) is a valid port.
[ -- ] We are attempting to discover the operating system type and version number ...
[ -- ] Distro: Ubuntu
[ -- ] Version: 16.04
[ -- ] Codename: xenial
[ OK ] This distro is supported by apache2buddy.pl.
[ OK ] This distro version is supported by apache2buddy.pl.
[ -- ] Hostname: ip-xxx-xxx-xxx-xxx.ec2.internal
[ -- ] Primary IP: xxx.xxx.xxx.xxx
[ -- ] We are checking the service running on port 80...
[ -- ] The process listening on port 80 is /usr/sbin/apache2
[ -- ] The process running on port 80 is Apache/2.4.18 (Ubuntu).
[ -- ] The full path to the Apache config file is: /etc/apache2/apache2.conf
[ -- ] Apache is using prefork model.
[ -- ] pidfile setting is /var/run/apache2/apache2$SUFFIX.pid.
[ -- ] Actual pidfile is /var/run/apache2/apache2.pid.
[ -- ] Parent PID: 14833.
[ OK ] Memory usage of parent PID is less than 50MB: 9712 Kilobytes.
[ -- ] Apache has been running 0d 0h 08m 55s.
[ !! ] *** LOW UPTIME ***.
[ @@ ] The following recommendations may be misleading - apache has been restarted within the last 24 hours.
[ -- ] Your server has 7981 MB of PHYSICAL memory.
[ >> ] ServerLimit directive not found, assuming default values.
[ -- ] Your ServerLimit setting is 256.
[ -- ] Your MaxRequestWorkers setting is 19.
[ OK ] Current Apache Process Count is 16, including the parent PID.
[ -- ] Number of vhosts detected: 30.
[ -- ] |________ of which 23 are HTTP (specifically, port 80).
[ -- ] |________ of which 7 are HTTPS (specifically, port 443).
[ @@ ] Current Apache vHost Count is greater than maxrequestworkers, which is unusual, but can be valid in some scenarios.
[ -- ] Your MaxRequestsPerChild setting is 5000.
[ -- ] This server is NOT running Plesk.
[ -- ] This server is NOT running cPanel.
[ -- ] Virtualmin Version: 6.07.gpl
[ -- ] Webmin Version: 1.930
Use of uninitialized value $real_config in concatenation (.) or string at -
line 1212 (#1)
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
To help you figure out what was undefined, perl will try to tell you
the name of the variable (if any) that was undefined. In some cases
it cannot do this, so it also tells you what operation you used the
undefined value in. Note, however, that perl optimizes your program
and the operation displayed in the warning may not necessarily appear
literally in your program. For example, "that $foo" is usually
optimized into "that " . $foo, and the warning will refer to the
concatenation (.) operator, even though there is no . in
your program.
Use of uninitialized value $apache_proc_php in concatenation (.) or string at -
line 2366 (#1)
[ -- ] Your PHP Memory Limit (Per-Process) is .
Use of uninitialized value $apache_proc_php in string eq at - line 2367 (#1)
[ -- ] MySQL Detected => Using 1554.09 MB of memory.
[ OK ] No large log files were found in /var/log/apache2.
[ >> ] MaxClients has been hit recently (maximum of 5 results shown), consider the dates and times below:
[Tue Sep 17 02:01:26.796315 2019] [mpm_prefork:error] [pid 3161] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 02:35:48.069254 2019] [mpm_prefork:error] [pid 6394] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 02:56:25.712864 2019] [mpm_prefork:error] [pid 8685] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 03:32:59.609792 2019] [mpm_prefork:error] [pid 12435] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 03:52:05.692572 2019] [mpm_prefork:error] [pid 14833] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[ OK ] No PHP Fatal Errors were found.
Use of uninitialized value $_ in pattern match (m//) at - line 2172 (#1)
[ -- ] apache2 is currently using 932.67 MB of memory.
[ -- ] The smallest apache process is using 3.35 MB of memory
[ -- ] The average apache process is using 22.50 MB of memory
[ -- ] The largest apache process is using 29.73 MB of memory
[ OK ] Going by the average Apache process, Apache can potentially use 427.50 MB RAM:
Without considering services: 5.36 % of total installed RAM
Considering extra services: 6.66 % of remaining RAM
[ OK ] Going by the largest Apache process, Apache can potentially use 564.88 MB RAM:
Without considering services: 7.08 % of total installed RAM
Considering extra services: 8.79 % of remaining RAM
--------------------------------------------------------------------------------
### GENERAL FINDINGS & RECOMMENDATIONS ###
--------------------------------------------------------------------------------
Apache2buddy.pl report for server: ip-xxx-xxx-xxx-xxx.ec2.internal (xxx.xxx.xxx.xxx):
Settings considered for this report:
[ !! ] *** LOW UPTIME ***.
[ @@ ] The following recommendations may be misleading - apache has been restarted within the last 24 hours.
Your server's physical RAM: 7981 MB
Remaining Memory after other services considered: 6426 MB
Apache's MaxRequestWorkers directive: 19 <--------- Current Setting
Apache MPM Model: prefork
Largest Apache process (by memory): 29 MB
[ !! ] Vhost count exceeds MaxRequestWorkers limits!
[ !! ] Your MaxRequestWorkers setting is too low.
Your recommended MaxRequestWorkers setting is between 194 and 216. <------- Acceptable Range (10% of MAX)
Max potential memory usage: 564 MB
Percentage of TOTAL RAM allocated to Apache: 7.08 %
Percentage of REMAINING RAM allocated to Apache: 8.79 %
--------------------------------------------------------------------------------
A log file entry has been made in: /var/log/apache2buddy.log for future reference.
Last 5 entries:
2019/09/14 18:49:19 Uptime: "0d 01h 16m 17s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "45" Recommended: "96" Smallest: "3.32 MB" Avg: "31.03 MB" Largest: "72.06 MB" Highest Pct Remaining RAM: "46.70%" (40.64% TOTAL RAM)
2019/09/15 01:53:51 Uptime: "0d 05h 02m 19s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "45" Recommended: "71" Smallest: "3.32 MB" Avg: "53.25 MB" Largest: "96.15 MB" Highest Pct Remaining RAM: "63.15%" (54.22% TOTAL RAM)
2019/09/15 22:24:37 Uptime: "0d 0h 25m 22s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "150" Smallest: "3.32 MB" Avg: "21.83 MB" Largest: "49.91 MB" Highest Pct Remaining RAM: "12.61%" (11.89% TOTAL RAM)
2019/09/16 05:47:19 Uptime: "0d 04h 24m 25s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "96" Smallest: "8.30 MB" Avg: "43.31 MB" Largest: "69.24 MB" Highest Pct Remaining RAM: "19.68%" (16.49% TOTAL RAM)
2019/09/17 04:00:53 Uptime: "0d 0h 08m 55s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "216" Smallest: "3.35 MB" Avg: "22.50 MB" Largest: "29.73 MB" Highest Pct Remaining RAM: "8.79%" (7.08% TOTAL RAM)
0 Answers