I'm using Postgres 9.X on a recent Ubuntu (Debian) Server and have been searching for maintenance script examples and unfortunately come up empty.
The Postgres docs are comprehensive on the abstract subject of maintenance -- Its databases should periodically be:
- vacuumed / analyzed (+autovacuum)
- reindexed
- clustered?
- logs rotated
Unfortunately the docs don't get to a concrete example, perhaps because of the many variables involved. Still, I'd appreciate an example of a baseline maintenance script:
- One for a small (both size & usage) database that expects to grow over the next few years.
- A bit of scheduled downtime is fine. For example, once a week, client apps will shut down for a while so postgres can do its thing.
I've started with this, please help to improve it:
#!/bin/bash
# this script maintains a Postgres Database Cluster
# set any needed env vars:
# PGDATABASE, PGUSER, PGPASSWORD, PGHOST, PGPORT, etc.
read -p "Press ENTER to stop Postgres Clients, start maintenance..."
sudo stop websrvr1
sudo stop websrvr2
sleep 2
echo -e "\nStarting backup ..."
# another script for another question
echo -e "\nStarting maintenance ..."
# What is the best order for these?
sudo -u postgres reindexdb --all --system
sudo -u postgres clusterdb --all
sudo -u postgres vacuumdb --all --analyze # --full # warned against
# log rotation? Supposed to be handled automatically,
# but I not yet seen more than one log file.
echo Complete. Type ... to start services.
Autovacuum is on by default. Reindexing and clustering aren't really baseline maintenance tasks. You can do them when needed to improve performance or (reindex) to fix index corruption, but they are not something you would blindly do regularly. Log rotation is taken care of automatically by the Debian packaging framework. So as a baseline, you don't need to do any of these things yourself.
What you should actually be doing is setting up backups (for a start, run
pg_dumpall
daily from a cron job), and monitoring (something like Nagios and Munin, perhaps, for a start). And then based on such monitoring you will regularly tune server settings and try to improve query performance.I think vacuuming should have some threshold — say > 30 % of wasted DB space/rows. This would save lots of time and meaningless efforts.