I use a server with the following hardware:
CPU: AMD Ryzen 9 5950X, 16 cores
RAM: 128GB DDR4 ECC
Storage: NVMe SSD
On it I have installed Ubuntu 18.04 LTS with the latest kernel. There are no control panels, only the following services needed for my PHP application:
Nginx 1.19.10
Php7.4-fpm
Elasticsearch 7.11.2
RabbitMQ broker
Varnish 6.4.0
Redis cache
Percona MySQL 8.0.22-13
I won't go into further detail about how I have divided the server's resources among these services. I will only say that RAM is allocated to them in the following order of priority:
Mysql
PHP FPM
NGINX
Varnish
Elasticsearch
and a small amount for RabbitMQ and Redis
After monitoring this server for about 2 months under the same traffic (no significant changes), it seems that my PHP application runs more smoothly and faster about 1 hour after rebooting the server than when the server has been running for a week without a reboot.
After about 4-5 days of uptime, RAM usage in htop climbs to 40-45GB of the 128GB and never goes beyond that point; I have never seen more than 45GB in use. The CPU load also never goes above 4.00.
In top I usually see:
Tasks: 595 total, 1 running, 429 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.7 us, 0.1 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13196481+total, 44963736 free, 27595196 used, 59405888 buff/cache
KiB Swap: 71303160 total, 71303160 free, 0 used. 10288868+avail Mem
In vmstat 10 4 I currently have:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd     free    buff    cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
 0  0      0 44972896 6995364 52411440    0    0     9    37   12   12   2   0  98   0   0
 1  0      0 44972676 6995400 52411536    0    0     0   296 1667 2226   3   1  96   0   0
 1  0      0 44970704 6995400 52411544    0    0     0   558 1186 1610   2   0  98   0   0
 1  0      0 44965440 6995400 52411540    0    0     0    52  986 1555   0   0 100   0   0
And in free -m I have:
              total        used        free      shared  buff/cache   available
Mem:         128871       26809       44046         262       58015      100615
Swap:         69631           0       69631
As I see it, most of the RAM is used for cache, which is what I have configured for MySQL, PHP-FPM and NGINX, and swap is hardly used. At peak times I see about 3.00M in top; I have seen it go up to 6.00M.
The question is: why does this server, instead of behaving better once its caches are filled, seem to do the opposite? It seems to run faster when RAM is not filled.
Is there anything in the Ubuntu settings that needs further investigation?
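To make the breakdown in the free -m output concrete: the three main columns add up to the total (26809 used + 44046 free + 58015 buff/cache = 128870 MiB, within rounding of the 128871 MiB total), and the "available" column (100615 MiB) is the kernel's estimate of what applications could still claim. The following is a minimal Python sketch that derives similar figures from /proc/meminfo, so they can be logged over several days. The function names are made up for illustration, and the formula only approximates what free reports.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: KiB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def summarize_memory(info):
    """Return MiB figures that roughly match the columns of free -m."""
    # buff/cache: buffers + page cache + reclaimable slab memory
    cache = info["Buffers"] + info["Cached"] + info.get("SReclaimable", 0)
    # used is whatever is neither free nor buff/cache
    used = info["MemTotal"] - info["MemFree"] - cache
    return {
        "total": info["MemTotal"] // 1024,
        "used": used // 1024,
        "free": info["MemFree"] // 1024,
        "buff/cache": cache // 1024,
        "available": info.get("MemAvailable", 0) // 1024,
    }
```

On the server itself this could be called as summarize_memory(parse_meminfo(open("/proc/meminfo").read())), for example from a cron job, to record how the figures change between a fresh reboot and a week of uptime.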
Thank you in advance!
Relying on "manual" tests should be the first step, but it is error-prone. Now you have to dig deeper and investigate with proper tools to obtain synthetic benchmark data.
From my experience I would suggest to:
This will give you enough data to analyze, and to show us more details so that we can help you. With the current settings and after heavy benchmarking you should observe slower responses.
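As a rough illustration of turning "feels faster after a reboot" into numbers, here is a minimal Python sketch that times sequential HTTP requests and reduces them to a few statistics. It is not a substitute for a real load-testing tool: it sends one request at a time, so it measures latency rather than throughput. The function names and the default request count are assumptions chosen for the example.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return the wall-clock seconds taken to fetch url once."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include the body transfer in the timing
    return time.perf_counter() - start


def summarize(latencies):
    """Reduce a non-empty list of latencies to a few comparable numbers."""
    ordered = sorted(latencies)
    # Nearest-rank 95th percentile, computed with integer arithmetic
    rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def run(url, requests=200):
    """Fetch url sequentially `requests` times and summarize the timings."""
    return summarize([time_request(url) for _ in range(requests)])
```

Running run() against the same URL about an hour after a reboot and again after a week of uptime, once through Varnish and once directly against NGINX, would show whether the slowdown is measurable and in which layer it appears.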
Take a look at the Varnish statistics and interpret them, and at the MySQL slow log, the Varnish log (workers, queues, waiting workers, etc.), the Xms and Xmx settings for Elasticsearch, the RabbitMQ counters (queues) and more. This should tell you more about the root issue.
The last resort before implementing APM inside the application is to check all of the metrics mentioned above for every service in use. From my perspective, implementing APM will be inevitable.
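For the Elasticsearch item, the check of Xms and Xmx can be sketched in Python as below. It assumes the heap is set with plain -Xms/-Xmx lines in a jvm.options file (on Debian/Ubuntu package installs this is usually /etc/elasticsearch/jvm.options). The thresholds follow the commonly cited guidance: equal Xms and Xmx, no more than half of RAM, and below roughly 32 GiB so that compressed object pointers stay enabled. The function names are hypothetical.

```python
import re

# Multipliers to convert a JVM size suffix into KiB
_KIB_PER_UNIT = {"k": 1, "m": 1024, "g": 1024 * 1024}


def heap_settings(jvm_options_text):
    """Extract -Xms and -Xmx (in MiB) from jvm.options-style text."""
    found = {}
    for line in jvm_options_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # commented-out option
        match = re.fullmatch(r"-X(ms|mx)(\d+)([kKmMgG])", line)
        if match:
            flag, number, unit = match.groups()
            kib = int(number) * _KIB_PER_UNIT[unit.lower()]
            found["X" + flag] = kib // 1024
    return found


def heap_warnings(settings, total_ram_mib):
    """Return a list of human-readable warnings about the heap settings."""
    xms, xmx = settings.get("Xms"), settings.get("Xmx")
    if xms is None or xmx is None:
        return ["Xms or Xmx is not set explicitly"]
    warnings = []
    if xms != xmx:
        warnings.append("Xms and Xmx differ")
    if xmx > total_ram_mib // 2:
        warnings.append("heap exceeds half of RAM")
    if xmx >= 32 * 1024:
        # Approximate threshold; the exact cutoff depends on the JVM
        warnings.append("heap is at or above about 32 GiB")
    return warnings
```

With the 128871 MiB total from free -m, heap_warnings(heap_settings(text), 128871) returns an empty list when none of these rules of thumb is violated. A heap that is too large also takes memory away from the page cache that Elasticsearch and MySQL rely on.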