Re: temporary tables, indexes, and query plans

From: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: temporary tables, indexes, and query plans
Date: 2010-11-13 03:31:55
Message-ID: AANLkTikS2_v7KGj3yafJ6pjTSLeOTkcaBhg14Am7X_OE@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Oct 28, 2010 at 9:23 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
>> On Wed, Oct 27, 2010 at 5:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> It thinks it's faster, or there is some reason why it *can't* use the
>>> index, like a datatype mismatch.  You could tell which by trying "set
>>> enable_seqscan = off" to see if that will make it change to another
>>> plan; if so, the estimated costs of that plan versus the original
>>> seqscan would be valuable information.
>
>> When I place the index creation and ANALYZE right after the bulk
>> update, follow it with 'set enable_seqscan = false', the next query
>> (also an UPDATE - should be about 7 rows) results in this plan:
>
>> Seq Scan on foo_table  (cost=10000000000.00..10000004998.00 rows=24 width=236)
>
> OK, so it thinks it can't use the index.  (The "cost=10000000000" bit is
> the effect of enable_seqscan = off: it's not possible to just never use
> seqscans, but we assign an artificially high cost to discourage the
> planner from selecting them if there's any other alternative.)
>
> So we're back to wondering why it can't use the index.  I will say
> once more that we could probably figure this out quickly if you'd
> post an exact example instead of handwaving.

OK. This is a highly distilled example that shows the behavior.
The ANALYZE doesn't appear to change anything, nor the SET STATISTICS
(followed by ANALYZE), nor disabling seqential scans. Re-writing the
table with ALTER TABLE does, though.
If the initial UPDATE (the one before the index creation) is commented
out, then the subsequent updates don't use sequential scans.

\timing off
BEGIN;
CREATE TEMPORARY TABLE foo AS SELECT x AS A, chr(x % 75 + 32) AS b,
''::text AS c from generate_series(1,500) AS x;
UPDATE foo SET c = 'foo' WHERE b = 'A' ;
CREATE INDEX foo_b_idx on foo (b);

-- let's see what it looks like
EXPLAIN UPDATE foo SET c='bar' WHERE b = 'C';

-- does forcing a seqscan off help?
set enable_seqscan = false;
EXPLAIN UPDATE foo SET c='bar' WHERE b = 'C';

-- what about analyze?
ANALYZE VERBOSE foo;
EXPLAIN UPDATE foo SET c='bar' WHERE b = 'C';

-- what about statistics?
ALTER TABLE foo ALTER COLUMN b SET STATISTICS 10000;
ANALYZE VERBOSE foo;
EXPLAIN UPDATE foo SET c='bar' WHERE b = 'C';

-- let's re-write the table
ALTER TABLE foo ALTER COLUMN a TYPE int;
EXPLAIN UPDATE foo SET c='bar' WHERE b = 'C';

ROLLBACK;

--
Jon

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Cédric Villemain 2010-11-13 05:44:25 Re: anti-join chosen even when slower than old plan
Previous Message Robert Haas 2010-11-12 21:07:47 Re: questions regarding shared_buffers behavior