I'm looking for resources on how to grow our server setup.
We currently have one dedicated server with Rackspace in the UK of the following spec:
HPDL385_G2_PrevGen
HP Single Dual Core Opteron 2214 (2.2GHz)
4GB RAM
2x 10,000 RPM SCSI drives in RAID 1
Our traffic is up to 550,000 UVs per month.
The site runs off a PHP and MySQL setup. The database gets an absolute hammering; we have many complex queries joining multiple tables.
We are using APC for PHP caching.
I'm getting to the stage where I've done as much DB and query optimisation as I can, and I wonder what the next step should be.
I've looked at memcached, but I've got the impression that it requires a large amount of RAM and ideally a dedicated box.
So is the next step to have two boxes: one for the database, one for Apache? Or is there a step I've overlooked?
Our load is usually around the 2 mark, but right now it's up at 20!
Some graphs from Munin:
Buy some hardware, but put it in your test lab, not in the data centre. Then stress your application on various hardware/software combinations until you find a reasonable one which will do what you want.
Of course you'll need to engineer something which can create fake traffic against a production-like database running a test copy of your app. But who said it would be easy?
If you don't do this and simply make changes in production, you've no idea whether they're going to work or not, and you may have spent a LOT of engineering effort implementing things like caches (which will come with their fair share of bugs!) on something that doesn't help.
Test, test, and test more. Don't throw hardware/software changes into production until you've got good performance data showing that it is likely to improve matters significantly. Engineering effort is expensive, test hardware is not (particularly).
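For the fake-traffic part, one rough option is to replay a list of URLs pulled from the production access log against the test box. A minimal PHP sketch using curl_multi is below; the replay-urls.txt file and the concurrency level are placeholders, not anything your app needs to have already.

```php
<?php
// Replay recorded URLs against the test environment with a few concurrent
// requests. "replay-urls.txt" (one URL per line) and the concurrency level
// are placeholders - adjust them to approximate your real traffic shape.
$urls = array_filter(array_map('trim', file('replay-urls.txt')));
$mh   = curl_multi_init();

foreach (array_slice($urls, 0, 20) as $url) {   // 20 concurrent requests
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
}

// Drive all handles to completion.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);                      // wait for network activity
} while ($running > 0);
```

It's crude (no think time, no sessions), but it's enough to push a test copy of the app hard while you watch the numbers.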
Memcached is just one option, and you probably need not consider it until you've got the database's caching working optimally. That means putting it on a dedicated (64-bit, of course) box with a reasonable amount of RAM (not 4GB - laptops have that nowadays; 32GB is definitely affordable) and tuning it appropriately.
You've not mentioned how big your database is, but if it's at all feasible you will want to get it entirely in RAM, or at least the hot bits. That makes read I/O operations disappear and stops them being a bottleneck.
Profile your database queries. There are tools knocking around for doing this - you should be able to simulate production load in your test environment. The trick is to avoid slow queries and ensure that the frequently-executed ones are fast.
If your performance problems are related to I/O syncs because you're simply doing too many transactions for the database, ensure that you're using a battery-backed RAID controller that is behaving correctly (talk to your vendor about it). It allows far more write operations per second than a non-battery-backed one, because the data only need to reach the controller's cache before the OS gets the confirmation. Alternatively, if your data don't matter that much, consider relaxing the database's durability settings (e.g. InnoDB's innodb_flush_log_at_trx_commit).
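To gauge whether the hot data actually fits in RAM, a quick check is the InnoDB buffer pool hit rate. A rough PHP sketch follows; it assumes InnoDB tables and the PDO MySQL driver, and the connection credentials are placeholders.

```php
<?php
// Compare logical read requests against reads that had to go to disk.
// A hit rate well below ~99% usually means innodb_buffer_pool_size is too
// small to keep the hot data in memory. Credentials are placeholders.
$db = new PDO('mysql:host=localhost', 'monitor', 'secret');

$status = array();
foreach ($db->query("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'") as $row) {
    $status[$row['Variable_name']] = (int) $row['Value'];
}

$requests  = $status['Innodb_buffer_pool_read_requests'];  // reads served from the pool
$diskReads = $status['Innodb_buffer_pool_reads'];          // reads that missed the pool
$hitRate   = $requests ? 100 * (1 - $diskReads / $requests) : 0.0;

printf("InnoDB buffer pool hit rate: %.2f%%\n", $hitRate);
```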
By looking into caching solutions, as many others have suggested here, you can expect to end up with about 10% of the load you have today, maybe less.
However, this depends on what kind of services you run on your machine. You can do a lot with memcached without very much RAM.
You should try to profile which database queries take the longest, either by using MySQL's slow query log (or the equivalent for your database), or by using a tool such as mytop. Also, MySQL's EXPLAIN SELECT syntax may be helpful. Caching the results of a few selected MySQL queries (even for just a short period of time) can really improve your performance a lot.
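As a sketch of that last point, something like the following caches one expensive join in APC's user cache (which you already have loaded for opcode caching) for a minute. The table and column names and the cache key are made up for illustration.

```php
<?php
// Cache the result set of one heavy query for 60 seconds using APC's user
// cache. Table/column names and the cache key are hypothetical examples.
function get_popular_articles(PDO $db)
{
    $key  = 'popular_articles_v1';
    $rows = apc_fetch($key, $hit);
    if ($hit) {
        return $rows;                       // served from cache, no DB work at all
    }

    $rows = $db->query(
        "SELECT a.id, a.title, COUNT(c.id) AS comments
           FROM articles a
      LEFT JOIN comments c ON c.article_id = a.id
       GROUP BY a.id, a.title
       ORDER BY comments DESC
          LIMIT 20"
    )->fetchAll(PDO::FETCH_ASSOC);

    apc_store($key, $rows, 60);             // even a short TTL absorbs most of the hits
    return $rows;
}
```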
I do a lot of performance and scale out work and what I've discovered is that:
Every application load is unique
Generic responses such as "add more RAM", "get another server", "do y", "try x" are often lessons in frustration and lead to convoluted setups.
Measure the Right Things
One of the biggest challenges is determining which benchmarks are important. This often requires a step back: you have to put yourself in your client's shoes. Sometimes simple site design changes can mean huge benefits to the web visitor. This is why I like tools like YSlow!, which focus more on the end-user's experience than on the server level. Once you decide what the right benchmark is for your site, you can begin tuning. Benchmarks may be total page load time, total page size, cache effectiveness, site latency, etc. You have to pick the one that makes sense for your application.
Nuts and Bolts
Once you are tracking the right benchmarks, start at a very low level. I like to use sysstat. It gives you a wealth of information and helps you tease apart which subsystem may be limiting overall application performance. Generally, I boil performance issues down to a small set of likely bottlenecks (CPU, RAM, disk I/O, or the application itself).
Using sysstat and other tools, you can begin to split hairs and find the subsystem that is limiting performance.
For example, I've seen highly loaded servers fail due to how their application was configured. Poor caching, lack of expires headers on static content, using HTTP vs. file includes, etc. all contributed to poor application performance. Fixing these application issues required no hardware changes. In other cases, I've seen the disks maxed out despite tons of caching. Moving to faster disks fixed the issue.
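To make the "HTTP vs. file includes" point concrete: both lines below pull in the same header fragment, but the first costs a full extra HTTP request back to your own server (and another Apache/PHP process) on every page view. The URL and path are placeholders.

```php
<?php
// Slow: fetches the fragment over HTTP, tying up a second Apache/PHP
// process for every page view. URL is a placeholder.
echo file_get_contents('http://www.example.com/fragments/header.php');

// Fast: read and executed in-process from the local filesystem.
include dirname(__FILE__) . '/fragments/header.php';
```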
Rinse and Repeat
Often during application tuning, you will uncork one bottleneck to only find another one. This is why I recommend trying to monitor whatever it is you are tuning.
For example, say you fix a disk I/O issue but your app is still slow. You may think you have wasted your effort, but what's happened is that you've simply hit another bottleneck. By monitoring disk I/O carefully, you can be sure you are improving it even if your headline application performance metrics are not changing.
Get the Right Tools
Make sure you are using the right tools for the job. Monitoring, testing, benchmarking, profiling, and other optimization techniques all have a variety of tools. Find the one that best matches your situation.
Rules of Thumb
While every app is unique, I do find some standard starting points:
Your Next Steps
If you don't find your bottleneck, adding a server may not help much. To solve a disk I/O problem you may need another server or a SAN. If you have a RAM bottleneck, another server helps only insofar as it adds more RAM, which is a pretty costly move compared to just adding more RAM to your existing server.
Quick Fix
Over-deploy. I've had to do this when it appears the application stack is the problem. Basically, load up on CPU, RAM and disk I/O (RAID 10, 15K SCSI or SSD). Go big on the hardware and then start tuning. This keeps you afloat until you solve the problems.
I would say the next step should be caching (data caching and/or page caching, depending on your functionality). If memcached seems too complex, you can start with a simple data caching solution like PEAR Cache_Lite, which requires just a couple of lines of code but can make a huge difference. Page (or page-fragment) caching is supported by the Smarty template engine, for example.
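A minimal Cache_Lite sketch, for illustration: the cache directory, lifetime and the function building the data are placeholders, and note that Cache_Lite stores strings unless you enable its automaticSerialization option.

```php
<?php
require_once 'Cache/Lite.php';

// cacheDir must be writable by the web server; lifeTime is in seconds.
$cache = new Cache_Lite(array(
    'cacheDir' => '/tmp/cache/',
    'lifeTime' => 300,
));

$id = 'homepage_stats';
if (($html = $cache->get($id)) === false) {
    $html = build_homepage_stats();   // hypothetical function doing the heavy DB work
    $cache->save($html, $id);
}
echo $html;
```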
Once caching doesn't cut it any more, you can add more servers, as there is nothing else left.
If you have enough free RAM, memcached will help you even on the same box. Try caching a few of the heaviest queries and see what happens. Also, Apache is too heavy; use nginx or lighttpd instead (with PHP running through FastCGI; see php-fpm).
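If you do try memcached locally, it can run with a modest allocation (e.g. memcached -m 64 -d), and a small wrapper like the one below is enough to test the effect on a handful of heavy queries. The pecl memcached extension and the key scheme are assumptions, not part of your current setup.

```php
<?php
// Cache query result sets in a local memcached for a couple of minutes.
// Assumes the pecl "memcached" extension; never store a literal false.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

function cached_query(Memcached $mc, PDO $db, $sql, $ttl = 120)
{
    $key  = 'q_' . md5($sql);
    $rows = $mc->get($key);
    if ($rows === false) {                            // cache miss
        $rows = $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
        $mc->set($key, $rows, $ttl);
    }
    return $rows;
}
```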
Start caching, but ignore MySQL for the moment. Seriously.
The rule should be: stop a request AS EARLY AS POSSIBLE. So a reverse proxy or proper Apache-level caching is going to get you the best results, then SQL-level result caching within the application, then SQL-level caching ;)
The earlier you stop a request, the less overhead you have. At the output-cache level, not even PHP needs to run, so to speak.
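An output-cache sketch in that spirit, placed at the very top of a PHP entry point: it assumes the page is identical for every visitor, and the key scheme and TTL are arbitrary examples.

```php
<?php
// Whole-page output caching: if a rendered copy exists, send it and stop
// before any include, query or template work happens.
$key = 'page_' . md5($_SERVER['REQUEST_URI']);

if (($html = apc_fetch($key)) !== false) {
    echo $html;
    exit;                                  // request stopped as early as PHP allows
}

ob_start();
// ... normal page generation goes here (includes, queries, templates) ...
// At the very end of the script:
$html = ob_get_flush();                    // send the page and keep a copy
apc_store($key, $html, 30);                // cache it for 30 seconds
```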