Re: temporary tables, indexes, and query plans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: temporary tables, indexes, and query plans
Date: 2010-11-14 02:54:58
Message-ID: AANLkTikeaQ98udD7hvmiHtUJQ9LYR0A9+yfsOucUnBc3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Nov 13, 2010 at 7:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sat, Nov 13, 2010 at 10:41 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> OK, this is an artifact of the "HOT update" optimization.  Before
>>> creating the index, you did updates on the table that would have been
>>> executed differently if the index had existed.  When the index does get
>>> created, its entries for those updates are incomplete, so the index
>>> can't be used in transactions that could in principle see the unmodified
>>> rows.
>
>> Is the "in principle" here because there might be an open snapshot
>> other than the one under which CREATE INDEX is running, like a cursor?
>
> Well, the test is based on xmin alone, not cmin, so it can't really tell
> the difference.  It's unclear that it'd be worth trying.

Yeah, I'm not familiar with the logic in that area of the code, so I
can't comment all that intelligently. However, I feel like there's a
class of things that could potentially be optimized if we know that
the only snapshot they could affect is the one we're currently using.
For example, when bulk loading a newly created table with COPY or
CTAS, we could set the xmin-committed hint bit if it weren't for the
possibility that some snapshot with a command-ID equal to or lower
than our own might take a look and get confused. That seems to
require a BEFORE trigger or another open snapshot. And, if we
HOT-update a tuple created by our own transaction that can't be of
interest to anyone else ever again, it would be nice to either mark it
for pruning or maybe even overwrite it in place; similarly if we
delete such a tuple it would be nice to schedule its execution. There
are problems with all of these ideas, and I'm not totally sure how to
make any of it work, but to me this sounds suspiciously like another
instance of a somewhat more general problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marti Raudsepp 2010-11-14 08:30:37 Re: MVCC performance issue
Previous Message Tom Lane 2010-11-14 00:54:06 Re: temporary tables, indexes, and query plans