Re: vacuum, performance, and MVCC

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
Cc: Mark Woodward <pgsql(at)mohawksoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, Christopher Browne <cbbrowne(at)acm(dot)org>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-24 09:40:23
Message-ID: Pine.OSF.4.61.0606241215230.44498@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 23 Jun 2006, Jonah H. Harris wrote:

> On 6/23/06, Mark Woodward <pgsql(at)mohawksoft(dot)com> wrote:
>> Rather than keep references to all versions of all
>> rows in indexes, keep only a reference to the first or "key" row of each
>> row, and have the first version of a row form the head of a linked list to
>> subsequent versions of each row. The list will be in decending order.
>
> By all means, please go ahead and try it because it's not quite that
> easy. You're going to run into serious locking and contention issues
> this way. In the end, it's not much better than running a sequential
> scan to query a row that's been updated several thousand times on a
> table that hasn't been vacuumed... follow that pointer :)

Can you elaborate what kind of locking and contention issues you're
thinking of?

You could update the index tuple to point to a newer version of the row,
when an index scan determines that the heap tuple it points to is not
visible to anyone. We already check that to update the XMAX_COMMITTED hint
bit. Updating the index tuple comes with a cost, of course, but
alleviates the "follow that pointer" issue.

The biggest challenge that I see is that an index scan would somehow
need to know when to follow the t_ctid chain and when not. If you follow
the pointer and there's another index tuple for the row, the scan could
see the same tuple twice. Some kind of bookkeeping would be needed to
solve that.

Also, vacuuming would become a bit more complex, since it would need to
update the index tuples to point to newer row versions instead of just
removing them.

All in all, I think this solution to the "an update needs to update all
indexes, even when none of the indexed columns changed" issue requires
less changes than implementing Oracle style rollback segments and/or an
undo log.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Volkan YAZICI 2006-06-24 11:45:33 Re: libpq Describe Extension [WAS: Bytea and perl]
Previous Message mark 2006-06-24 08:32:20 Re: vacuum, performance, and MVCC