Ping a Specific Port

Question

ehsanul

Asked: 2010-09-15 18:59:39 +0800 CST2010-09-15 18:59:39 +0800 CST 2010-09-15 18:59:39 +0800 CST

Bad performance for a large join query in PostgreSQL 8.4.4, despite enough memory to cache entirely

772

Question

How can I make the query described in this post faster, in particular by making PostgreSQL using the available RAM? - Note that I have tried to configure effective_cache_size and shared_buffers appropriately. See below.

Background

I have to regularly join a ~260 million row table (coreg_master) with new data that comes in. I've partitioned the table to allow each partition to fit in RAM. I do have appropriate indexes set up too of course. However, when joining the partitioned tables separately against the other (much smaller) tables, it's doing fully random IO on the disk. This is due to a nested loop index scan on the large table, which is really slow as we don't have a great disk setup.

I'd expect it to use all available RAM to cache the large partitioned table instead, which I understand should be done by the Linux kernel/filesystem itself. But it still doesn't load the table into RAM, though it would fit. I guess that's because the access pattern isn't sequential and thus doesn't trigger caching? I have no idea. Query plan and configuration parameters below.

Table Structure

This is one of the partitions of coreg_master, my large table. The partitioned tables are named coreg_a, coreg_b, etc.

\d coreg_a
                                   Table "public.coreg_a"
   Column    |       Type        |                         Modifiers                         
-------------+-------------------+-----------------------------------------------------------
 id          | integer           | not null default nextval('coreg_master_id_seq'::regclass)
 first_name  | character varying | 
 last_name   | character varying | 
 phone       | character varying | 
 city        | character varying | 
 zip         | integer           | 
 address     | character varying | 
 dob         | date              | 
 ip          | character varying | 
 source      | character varying | 
 gender      | character varying | 
 state       | character varying | 
 record_date | date              | 
 email       | character varying | 
Indexes:
    "coreg_a_name" btree (lower(first_name::text), lower(last_name::text))
Check constraints:
    "coreg_a_first_name_check" CHECK (first_name::text >= 'a'::text AND first_name::text < 'b'::text)
Inherits: coreg_master

The following is a partition of table appendable_24, an example of the table being joined with coreg_master. It is also partitioned the same way as coreg_master, so actually coreg_a is joined with appendable_24_a, etc, one at a time.

\d appendable_24_a
       Table "public.appendable_24_a"
   Column   |       Type        | Modifiers 
------------+-------------------+-----------
 line_num   | integer           | not null
 first_name | character varying | 
 last_name  | character varying | 
 address    | character varying | 
 state      | character varying | 
 zip        | integer           | 
Indexes:
    "appendable_24_a_name_index" btree (lower(first_name::text), lower(last_name::text))
Check constraints:
    "appendable_24_a_first_name_check" CHECK (first_name::text >= 'a'::text AND first_name::text < 'b'::text)
Inherits: appendable_24

Query & EXPLAIN ANALYZE

Below is the output of explain analyze for the smallest of the joins (tables are partitioned according to the first letter of the first_name column), as it doesn't take ages. However, the query plan is the same for all the joins on each partition, so it should be representative of the larger joins as well (note, i did ANALYZE the tables, and the total time was actually 20 seconds, but faster here as the result was cached):

explain analyze SELECT
          coreg_x.phone,
          coreg_x.email,
          coreg_x.record_date,
          appendable_24_x.line_num
        FROM appendable_24_x INNER JOIN coreg_x ON
          lower(appendable_24_x.first_name) = lower(coreg_x.first_name) AND
          lower(appendable_24_x.last_name) = lower(coreg_x.last_name) AND
          (coreg_x.phone IS NOT NULL OR coreg_x.email IS NOT NULL) AND
          similarity(lower(appendable_24_x.address), lower(coreg_x.address)) > 0.7
      ; 
                                                                                       QUE
RY PLAN 

------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------
-----
 Nested Loop  (cost=0.01..640.49 rows=875 width=39) (actual time=9.990..53.839 rows=29 loo
ps=1)
   Join Filter: (similarity(lower((appendable_24_x.address)::text), lower((coreg_x.address
)::text)) > 0.7::double precision)
   ->  Seq Scan on appendable_24_x  (cost=0.00..1.80 rows=80 width=34) (actual time=0.009.
.0.111 rows=80 loops=1)
   ->  Index Scan using coreg_x_name on coreg_x  (cost=0.01..7.95 rows=1 width=64) (actual
 time=0.024..0.137 rows=44 loops=80)
         Index Cond: ((lower((coreg_x.first_name)::text) = lower((appendable_24_x.first_na
me)::text)) AND (lower((coreg_x.last_name)::text) = lower((appendable_24_x.last_name)::tex
t)))
         Filter: ((coreg_x.phone IS NOT NULL) OR (coreg_x.email IS NOT NULL))
 Total runtime: 53.950 ms
(7 rows)

Some stats, configuration parameters and other data

PostgreSQL Version: 8.4.4
OS: CentOS release 5.5 (Final)
Filesystem: ext3
Total available RAM: 8GB
shared_buffers = 2GB
effective_cache_size = 7200MB
full runtime configuration through show all: http://pastie.org/1159746
largest partitioned coreg table (coreg_j) size: ~4900MB
corresponding number of rows: ~32 million
corresponding (first_name, last_name) index size: ~1000MB
second partitioned table (appendable_24_j) size: ~1800kB
rows in appendable_24_j: ~25,000

1 Answers

Voted

eevar · Answer 1 · 2010-09-20T10:45:38+08:00

eevar

2010-09-20T10:45:38+08:002010-09-20T10:45:38+08:00

There's only enough ram to cache a fraction of your db, and as the query plan you posted shows, previously accessed data/portions of the index in question are indeed cached. (Postgres does not cache query results).

53 ms runtime isn't too shabby, and I'm not sure 20 sec on uncached data means that PG picked a bad query plan either. After all, just the index in question is 1GB large, but it would be interesting to see analyze output for a slow query.

You could try adjusting planner costs to see if that has any effect on worst-case performance, if that's your issue.

You might also want to increase your maintentance_work_mem a bit, even if that's in no way related.

2

Bad performance for a large join query in PostgreSQL 8.4.4, despite enough memory to cache entirely

Ping a Specific Port

How do I tell Git for Windows where to find my private RSA key?

How do you restart php-fpm?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Resolve host name from IP address

How can I sort du -h output by size

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?