Jake

Asked: 2012-07-05 17:29:33 +0800 CST

Postgres 9.0 locking up, 100% CPU


We are having a problem where our Postgres 9.0 server occasionally locks up and kills our webapp. Restarting Postgres fixes the problem.

Here's what I've been able to observe:

First, usage of one CPU jumps to 100% for a few minutes
- Disk operations drop to ~0 during this time
- Database operations drop to 0 (blocks and tuples per sec)
- Logs show during this time:
  - WARNING: worker took too long to start; cancelled
  - WARNING: worker took too long to start; cancelled
  - No Queries in logs (only those over 200ms are logged)
- No unusually long-running queries logged before or during
Then the second CPU jumps to 100%
- The number of postgres processes jumps from the usual 8-10 to ~20
- Matched by a spike in Postgres Blocks per second (about twice normal)
- Logs show
  - LOG: could not accept SSL connection: EOF detected
- Queries are running but slow
Restarting postgres returns everything to normal

Setup:

Server: Amazon EC2 Large
Ubuntu 10.04.2 LTS
Postgres 9.0.3
Dedicated DB server

Does anyone have any idea what's causing this? Or any suggestions about what else I should be checking out?

1 Answer


Mike Shultz
2015-06-02T19:21:43+08:00
First make sure the server is not running out of memory and swapping, which causes disk thrashing.

If there is plenty of free memory, connect to PostgreSQL directly and look for an offending query:

SELECT * FROM pg_stat_activity;
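
On a busy server the full output of that view can be hard to read. A narrower version might look like the sketch below. It assumes the PostgreSQL 9.0 column names (procpid, current_query, waiting); on 9.2 and later these columns were renamed, so adjust accordingly. It lists only non-idle backends, oldest query first.

```sql
-- Sketch only: assumes PostgreSQL 9.0/9.1 column names.
-- On 9.2+ use pid and query instead of procpid and current_query.
SELECT procpid,                          -- backend process ID
       datname,                          -- database the backend is connected to
       usename,                          -- role running the query
       waiting,                          -- true if the backend is blocked on a lock
       now() - query_start AS runtime,   -- how long the current query has been running
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'          -- skip connections with nothing running
ORDER BY query_start;                    -- longest-running queries first
```

If one statement turns out to be the culprit, SELECT pg_cancel_backend(procpid) with its procpid cancels that query without restarting the whole server.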