Re: Single Index Tuple Chain (SITC) method

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <gsstark(at)mit(dot)edu>, PFC <lists(at)peufeu(dot)com>
Subject: Re: Single Index Tuple Chain (SITC) method
Date: 2006-06-28 22:39:51
Message-ID: 1151534391.2897.14.camel@localhost.localdomain
Lists: pgsql-hackers

On Wed, 2006-06-28 at 18:19, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Here is an overview of the SITC method:
> > http://momjian.us/cgi-bin/pgsitc
>
> A pretty fundamental problem is that the method assumes it's OK to
> change the CTID of a live tuple (by swapping its item pointer with some
> expired version). It is not --- this will break:
> * active UPDATEs and DELETEs that may have fetched the CTID
> but not yet completed processing to decide whether to change
> the tuple;
> * pending AFTER ROW triggers, such as foreign key checks;
> * ODBC as well as other applications that assume CTID is a
> usable unique row identifier within transactions.

We should *always* return the ctid of the CITC chain head, as this is the
one that does not change.

And anyway, isn't ctid a usable unique row identifier only within
read-only transactions?

> VACUUM FULL can get away with moving tuples to new CTIDs because it takes
> AccessExclusiveLock, so there can be no open transactions with knowledge
> of current CTIDs in the table. This is not OK for something that's
> supposed to happen in plain UPDATEs, though.

Would it still be a problem if we *always* referred to the whole CITC
chain by its externally visible ctid, and looked up the real tuple inside
the tuple fetch op at every access?

(1) If we had some special bits for tuples at the CITC chain head and for
tuples inside a CITC chain but not at its head, then even a seqscan could
ignore non-head CITC chain members in its find-next-tuple op and do the
real tuple lookup in some inner function when it hits a CITC head.
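
To make (1) concrete, here is a minimal sketch in C of what I mean. It is
a toy model, not backend code: the flag bits, the ToyTuple/ToyPage structs
and the function names are all made up for illustration, and a plain
"visible" boolean stands in for the real MVCC visibility check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; these are NOT real PostgreSQL infomask values. */
#define TUP_CHAIN_HEAD   0x01  /* chain head: its slot is the external ctid */
#define TUP_CHAIN_MEMBER 0x02  /* inside a chain, but not at the head */

#define NO_NEXT    (-1)
#define PAGE_SLOTS 8

typedef struct ToyTuple {
    bool     used;     /* slot holds a tuple */
    unsigned flags;
    int      next;     /* slot of the next version in the chain, or NO_NEXT */
    bool     visible;  /* stand-in for a real visibility check */
    int      value;
} ToyTuple;

typedef struct ToyPage {
    ToyTuple slots[PAGE_SLOTS];
} ToyPage;

/* Map an external ctid (head slot) to the slot of the visible version,
 * or -1 if no version in the chain is visible. */
int chain_resolve(const ToyPage *page, int head_slot)
{
    assert(head_slot >= 0 && head_slot < PAGE_SLOTS);
    int slot = head_slot;
    int steps = 0;  /* guards against a corrupted, cyclic chain */
    while (slot != NO_NEXT && steps++ < PAGE_SLOTS) {
        const ToyTuple *t = &page->slots[slot];
        if (t->used && t->visible)
            return slot;
        slot = t->next;
    }
    return -1;
}

/* Find-next-tuple op of a toy seqscan. Returns the external ctid slot,
 * or -1 at end of page; *real_slot receives the resolved version. */
int seqscan_next(const ToyPage *page, int *cursor, int *real_slot)
{
    while (*cursor < PAGE_SLOTS) {
        int slot = (*cursor)++;
        const ToyTuple *t = &page->slots[slot];
        if (!t->used || (t->flags & TUP_CHAIN_MEMBER))
            continue;  /* non-head members are reached only via their head */
        int real = chain_resolve(page, slot);
        if (real < 0)
            continue;  /* no visible version in this chain */
        *real_slot = real;
        return slot;
    }
    return -1;
}

/* Example page: one plain tuple, plus one chain 1 -> 3 -> 2. */
ToyPage demo_page(void)
{
    ToyPage p = {0};
    p.slots[0] = (ToyTuple){ true, 0,                NO_NEXT, true,  10 };
    p.slots[1] = (ToyTuple){ true, TUP_CHAIN_HEAD,   3,       false, 20 };
    p.slots[2] = (ToyTuple){ true, TUP_CHAIN_MEMBER, NO_NEXT, true,  22 };
    p.slots[3] = (ToyTuple){ true, TUP_CHAIN_MEMBER, 2,       false, 21 };
    return p;
}
```

With demo_page(), the scan reports slot 0 and then slot 1 as external
ctids, and slot 1 resolves to the live version in slot 2. Slots 2 and 3
are never reported on their own, so shuffling versions inside the chain
does not change what the scan hands out. The sketch assumes at most one
visible version per chain and says nothing about locking.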

Is it correct to assume that only one row version can be in the process
of being modified at any one time?

> Another problem is you can't recycle tuples, nor item ids, without
> taking a VACUUM-style lock on the page (LockBufferForCleanup). If
> anyone else is holding a pin on the page they risk getting totally
> confused --- for instance, a seqscan will either miss a tuple or scan it
> twice depending on which direction you're juggling item ids around it.

I think (1) above solves this, at the cost of looking twice at the tuple
headers of CITC-internal tuples.

> The concurrency loss involved in LockBufferForCleanup is OK for
> background-maintenance operations like VACUUM, but I seriously doubt
> anyone will find it acceptable for UPDATE. It could easily create
> application-level deadlocks, too. (VACUUM is safe against that because
> it only holds one lock.)

Tom, what do you think of the other related idea, that of reusing dead
index entries?

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
