Re: CREATE INDEX and HOT - revised design

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: CREATE INDEX and HOT - revised design
Date: 2007-03-23 07:50:36
Message-ID: 1174636236.3826.20.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ühel kenal päeval, N, 2007-03-22 kell 23:30, kirjutas Pavan Deolasee:
>
>
> On 3/22/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> > When CREATE INDEX starts, it acquires ShareLock on the
> table.
> > At this point we may have one or more HOT-update chains in
> the
> > table. Tuples in this chain may be visible to one or more
> running
> > transactions. The fact that we have ShareLock on the table
> means
> > that all tuples in the chain except the one at the head
> either
> > RECENTLY_DEAD or were UPDATEd by the same transaction
> > that is now running CREATE INDEX.
>
> This logic seems pretty questionable to me in view of the
> recent VACUUM
> FULL bug fixes. We now know that an update chain can contain
> tuples that
> appear DEAD but are later than ones that are
> RECENTLY_DEAD. How are
> you defining a HOT chain exactly --- will it be just a part of
> an
> update chain that consists only of contiguous tuples that
> appear live or
> RECENTLY_DEAD to the CREATE INDEX transaction?
>
>
> HOT-chain is something which contains contiguous tuples that share
> the same index keys for all the existing indexes and share a single
> index entry. It does not really matter if there are RECENTLY_DEAD
> tuples that appear before a DEAD tuple because from VACUUM FULL
> bug fix we know that they are DEAD too.
>
> My argument is that its enough to index only the LIVE tuple which
> is at the end of the chain if we don't use the new index for queries
> in transactions which were started before CREATE INDEX.

You mean, which were started before CREATE INDEX completes ?

Just wait for all concurrent transactions to complete before marking the
index as usable in plans.

> I am
> proposing to do that by storing an xid in the pg_index row.

I don't think it is a good idea to store xid's anywhere but in xmin/xmax
columns, as doing so would cause nasty xid wraparound problems.

Instead you should wait, after completeing the index , for all
concurrent transactions to end before you mark the index as "usable for
queries", similar to the way CREATE INDEX CONCURRENTLY does.

> A
> special case is where a tuple is UPDATEd multiple times by
> the same transaction which is also creating the index, in which case
> there are more than one LIVE versions of the tuple. But again
> we are safe by indexing only the latest version because all other
> versions would be invisible (even to us) once CREATE INDEX commits.
>
>
> > In fact, the serializable transactions started before CREATE
> INDEX
> > can not anyway see the index so all this is done to handle
> > read-committed transactions.
>
> You are laboring under an illusion that system catalog
> accesses are MVCC.
> SnapshotNow does not behave that way: the system can see the
> new index
> as soon as it's committed. (It had better, since it has to
> start
> updating the index immediately, whether it's safe to scan it
> or not.)
> I'm not sure whether that's fundamental to your argument or
> not, but
> it's certainly wrong.
>
>
> Oh, thanks for pointing that out. But thats certainly not fundamental
> to the argument as you probably already guessed. The xid still
> controls the usage of index for query planning, somewhat similar to
> "isindvalid" flag for CREATE INDEX CONCURRENTLY.

Xids are unstable and will come back to bite you after 2G transactions.

Why not just use the "isindvalid" flag ?

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2007-03-23 08:30:44 Re: CREATE INDEX and HOT - revised design
Previous Message Andrew - Supernews 2007-03-23 06:10:39 Re: LIKE optimization in UTF-8 and locale-C