Re: CREATE INDEX and HOT - revised design

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Pavan Deolasee" <pavan(dot)deolasee(at)enterprisedb(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: CREATE INDEX and HOT - revised design
Date: 2007-03-22 18:00:20
Message-ID: 2e78013d0703221100t2aa183d4vdebaeaac07c6aee3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/22/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> > When CREATE INDEX starts, it acquires ShareLock on the table.
> > At this point we may have one or more HOT-update chains in the
> > table. Tuples in this chain may be visible to one or more running
> > transactions. The fact that we have ShareLock on the table means
> > that all tuples in the chain except the one at the head either
> > RECENTLY_DEAD or were UPDATEd by the same transaction
> > that is now running CREATE INDEX.
>
> This logic seems pretty questionable to me in view of the recent VACUUM
> FULL bug fixes. We now know that an update chain can contain tuples that
> appear DEAD but are later than ones that are RECENTLY_DEAD. How are
> you defining a HOT chain exactly --- will it be just a part of an
> update chain that consists only of contiguous tuples that appear live or
> RECENTLY_DEAD to the CREATE INDEX transaction?

HOT-chain is something which contains contiguous tuples that share
the same index keys for all the existing indexes and share a single
index entry. It does not really matter if there are RECENTLY_DEAD
tuples that appear before a DEAD tuple because from VACUUM FULL
bug fix we know that they are DEAD too.

My argument is that its enough to index only the LIVE tuple which
is at the end of the chain if we don't use the new index for queries
in transactions which were started before CREATE INDEX. I am
proposing to do that by storing an xid in the pg_index row. A
special case is where a tuple is UPDATEd multiple times by
the same transaction which is also creating the index, in which case
there are more than one LIVE versions of the tuple. But again
we are safe by indexing only the latest version because all other
versions would be invisible (even to us) once CREATE INDEX commits.

> In fact, the serializable transactions started before CREATE INDEX
> > can not anyway see the index so all this is done to handle
> > read-committed transactions.
>
> You are laboring under an illusion that system catalog accesses are MVCC.
> SnapshotNow does not behave that way: the system can see the new index
> as soon as it's committed. (It had better, since it has to start
> updating the index immediately, whether it's safe to scan it or not.)
> I'm not sure whether that's fundamental to your argument or not, but
> it's certainly wrong.

Oh, thanks for pointing that out. But thats certainly not fundamental
to the argument as you probably already guessed. The xid still controls
the usage of index for query planning, somewhat similar to "isindvalid"
flag for CREATE INDEX CONCURRENTLY.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-03-22 18:13:48 Re: "Relation not found" error but table exits.
Previous Message Tom Lane 2007-03-22 17:50:42 Re: Bug in CREATE/DROP TABLESPACE command