Re: temporary tables, indexes, and query plans

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: temporary tables, indexes, and query plans
Date: 2010-10-27 20:29:13
Message-ID: AANLkTing+x8XCwGfzkoWUtoqtBBk0gv9eOVPqQ=O36Dn@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Oct 27, 2010 at 2:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
>> The most recent experiment shows me that, unless I create whatever
>> indexes I would like to see used *before* the large (first) update,
>> then they just don't get used. At all.
>
> You're making a whole lot of assertions here that don't square with
> usual experience.  I think there is some detail about what you're
> doing that affects the outcome, but since you haven't shown a concrete
> example, it's pretty hard to guess what the critical detail is.

First, let me supply all of the changed (from the default) params:

default_statistics_target = 500
maintenance_work_mem = 240MB
work_mem = 256MB
effective_cache_size = 1GB
checkpoint_segments = 128
shared_buffers = 1GB
max_connections = 30
wal_buffers = 64MB
shared_preload_libraries = 'auto_explain'

The machine is a laptop with 4GB of RAM running my desktop. Kernel is
2.6.36, filesystem is ext4 (for data) and ext2 (for WAL logs). The
disk is a really real disk, not an SSD.

The sequence goes exactly like this:

BEGIN;
CREATE TEMPORARY TABLE (20 columns, mostly text, a few int).
COPY (approx 8 million rows, ~900 MB)[1]
UPDATE (2.8 million of the rows)
UPDATE (7 rows)
UPDATE (250 rows)
UPDATE (3500 rows)
UPDATE (3100 rows)
a bunch of UPDATE (1 row)
...

Experimentally, I noticed that performance was not especially great.
So, I added some indexes (three indexes on one column each). One index
is UNIQUE.
The first UPDATE can't use any of the indexes. The rest should be able to.

In my experiments, I found that:

If I place the index creation *before* the copy, the indexes are used.
If I place the index creation *after* the copy but before first
UPDATE, the indexes are used.
If I place the index creation at any point after the first UPDATE,
regardless of whether ANALYZE is run, the indexes are not used (at
least, according to auto_analyze).

Does that help?

[1] I've been saying 10 million. It's really more like 8 million.
--
Jon

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kenneth Marshall 2010-10-27 20:41:15 Re: Select count(*), the sequel
Previous Message Kevin Grittner 2010-10-27 20:11:16 Re: Select count(*), the sequel