Re: vacuum, performance, and MVCC

From: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>, "Jan Wieck" <janwieck(at)yahoo(dot)com>, "Hannu Krosing" <hannu(at)skype(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Csaba Nagy" <nagy(at)ecircle-ag(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, "Christopher Browne" <cbbrowne(at)acm(dot)org>, "postgres hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-26 13:17:26
Message-ID: 18329.24.91.171.78.1151327846.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Heikki Linnakangas wrote:
>> On Mon, 26 Jun 2006, Jan Wieck wrote:
>>
>> > On 6/25/2006 10:12 PM, Bruce Momjian wrote:
>> >> When you are using the update chaining, you can't mark that index row
>> as
>> >> dead because it actually points to more than one row on the page,
>> some
>> >> are non-visible, some are visible.
>> >
>> > Back up the truck ... you mean in the current code base we have heap
>> tuples
>> > that are visible in index scans because of heap tuple chaining but
>> without
>> > index tuples pointing directly at them?
>>
>> In current code, no. Every heap tuple has corresponding index tuples.
>>
>> In Bruce's proposal, yes. You would have heap tuples without index
>> tuples
>> pointing directly at them. An index scan could only find them by
>> following
>> t_ctid chains.
>>
>> Correct me if I understood you incorrectly, Bruce.
>
> Correct! We use the same pointers used by normal UPDATEs, except we set
> a bit on the old tuple indicating it is a single-index tuple, and we
> don't create index entries for the new tuple. Index scan routines will
> need to be taught about the new chains, but because only one tuple in
> the chain is visible to a single backend, the callers should not need to
> be modified.
>
> (All tuples in the chain have page item ids. It is just that when they
> are freed, the pointers are adjusted so the index points to the chain
> head.)
>
> One problem is that once you find the row you want to update, it is
> difficult to see if it is part of a single-index chain because there are
> only forward pointers, so I think we have to scan the entire page to
> find the chains. To reduce that overhead, I am thinking we free the
> non-visible tuples only when the page has no more free space. This
> allows us to free not just our own non-visible tuples, but perhaps
> others as well.

This sort of incorporates the "vacuum row" I suggested.

>
> We have never been able to free non-visible tuples before because of
> index cleanup overhead, but with single-index chains, we can, and reduce
> the requirements of vacuum for many workloads.
>

This is great!

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-06-26 13:18:16 Re: vacuum, performance, and MVCC
Previous Message Mark Woodward 2006-06-26 13:10:03 Re: vacuum, performance, and MVCC