I seem to be having a problem whereby a server running PostgreSQL is experiencing rather high load, with the load average staying above 30.
This server is running as a VM on VMware.
What I'm wondering is: what is the best way to configure PostgreSQL so that it doesn't overload the box?
If I were to throw more resources at the VM, would that solve the issue, or would it just gobble those up as well? It is only the CPU that is getting hammered; the memory (2GB) is sufficient.
'High load' probably means a lot of queries running in parallel. Try running the following query:
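The standard view for this is pg_stat_activity; a version-proof form (the column names changed between PostgreSQL releases, so the wildcard is the safest choice):

```sql
-- One row per backend, including the query each one is currently executing
SELECT * FROM pg_stat_activity;
```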
This will show what the server is doing at the moment. Maybe you can see something suspicious here? A query which should not be called that often?
Also, for debugging performance problems it is very useful to set the log_min_duration_statement option in postgresql.conf. This will show you the queries which take a long time to execute; these are probably the ones causing most of the load. Often, optimizing a single faulty query (by rewriting it, adding indexes or tuning the PostgreSQL configuration) is enough to lower server load a lot.
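For example, in postgresql.conf (the 200 ms threshold is just an illustration; pick a value that fits your workload):

```
# Log every statement that runs longer than 200 ms.
# Value is in milliseconds; -1 disables the feature, 0 logs every statement.
log_min_duration_statement = 200
```

Reload the server configuration after changing it, then watch the PostgreSQL log for slow statements.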
I can't give you a configuration file, because most settings depend on your environment, and "high load" is not exactly a single type of problem. But here are some tips for how to track down performance issues with PGSQL.
Well, first of all you will have to check what your database is actually doing. Typically, when you have a runaway DB, there are a few heavy queries being run frequently. Check which queries you run, and then use EXPLAIN ANALYZE to get an idea of the cost of each query.
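For instance, assuming a hypothetical `orders` table:

```sql
-- EXPLAIN shows the planner's chosen plan; EXPLAIN ANALYZE actually runs
-- the query and reports real row counts and timings for each plan node.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
```

Note that EXPLAIN ANALYZE really executes the statement, so wrap data-modifying queries in a transaction you roll back.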
Next, start examining what the actual problem is. A badly written query can murder performance, but if the queries look good (and can't be optimized) you will have to see if you can make things easier for the DB in other ways.
Check your EXPLAIN output for sequential scans, especially if you are doing it on larger tables and in several steps. Check if you can introduce indexes that will help with your queries. EXPLAIN is the general performance tracking tool for queries.
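If the plan shows a sequential scan over a large table filtered on one column, an index on that column is often the fix (the table and column names here are made up for illustration):

```sql
-- Plan before: Seq Scan on orders  (Filter: customer_id = 42)
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
-- Running EXPLAIN again should now show an Index Scan
-- using orders_customer_id_idx instead of the Seq Scan.
```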
Don't forget to VACUUM and ANALYZE your database. Many distributions disable the autovacuum service, and then you have to do the vacuum/analyze manually. They might also disable the statistics collection, which makes the analyze process much less efficient. After you do this, go back and run EXPLAIN again to see if the queries are planned differently.
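Done by hand, that looks like this (run as the database owner or a superuser):

```sql
-- Reclaim space from dead rows and refresh planner statistics in one pass
VACUUM ANALYZE;

-- Or target a single table (name is illustrative)
VACUUM ANALYZE orders;
```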
If nothing else helps you will have to rebuild your database structure or start distributing it, but in my experience you won't need to do this unless you run a site with very heavy DB load or thousands of simultaneous users. It's usually individual queries that the database fails to execute in an optimal manner.
Finally check http://wiki.postgresql.org/wiki/Performance_Optimization for more tips.
Check that your tables have been vacuumed properly. High CPU usage on a PostgreSQL instance is usually a sign that there are a large number of dead (deleted) rows in the tables.
If you are using autovacuum, look at the table statistics to verify that it is actually running.
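A query along these lines shows dead-row counts and when (auto)vacuum last touched each table (exact column availability varies a little between PostgreSQL versions):

```sql
-- Tables with the most dead rows first; empty last_autovacuum timestamps
-- on busy tables suggest autovacuum is not running for them.
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```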