Re: vacuum, performance, and MVCC

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Mark Woodward <pgsql(at)mohawksoft(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, Christopher Browne <cbbrowne(at)acm(dot)org>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-25 17:02:05
Message-ID: 449EC18D.2000605@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6/25/2006 6:52 AM, Mark Woodward wrote:
>> On 6/24/2006 9:23 AM, Mark Woodward wrote:
>>
>>>> On Sat, 24 Jun 2006, Mark Woodward wrote:
>>>>
>>>>> I'm probably mistaken, but aren't there already forward references in
>>>>> tuples to later versions? If so, I'm only sugesting reversing the
>>>>> order
>>>>> and referencing the latest version.
>>>>
>>>> I thought I understood your idea, but now you lost me again. I thought
>>>> what you want is that the older heap tuple has a pointer to the
>>>> newer one. Which it already has, it's called t_ctid.
>>>
>>> Perfect!
>>>>
>>>> Can you try to explain more carefully how the whole thing would work?
>>>> What would an index tuple point to? What pointers would a heap tuple
>>>> have? What would an index scan do to find the row version it's
>>>> interested
>>>> in? What exactly would an update do?
>>>
>>>
>>> Since we already allocate space for some notion of linked list, then all
>>> I'm suggesting is we reverse the order, sort of. Currently it looks like
>>> this:
>>>
>>> ver001->ver002->ver003->...-verN
>>>
>>> That's what t_ctid does now, right? Well, that's sort of stupid. Why not
>>> have it do this:
>>>
>>> ver001->verN->...->ver003->ver002->|
>>> ^---------------------------------/
>>>
>>> This will speed up almost *all* queries when there are more than two
>>> version of rows.
>>>
>>> OK, here is the behavior of an update:
>>> (1) Find the latest version of the row
>>> (2) Duplicate row and modify as per plan
>>> (3) Set the t_ctid of the new row to the last "latest"
>>> (4) Set the t_ctid of the first row to that of the new row
>>> (5) Attempt to index the row
>>> (6) If the first version of the row is in the index already (ver001)
>>> Don't
>>> modify the index, otherwise, add the new version (just as before)
>>>
>>> When you vacuum, simply make the latest version (verN) the key row
>>> (ver001).
>>
>> This isn't done "simply". Currently, vacuum collects a trivial array of
>> ctid's it is removing and every now and then does a bulk remove of the
>> index tuples pointing to them. Now lets consider a table with two
>> indexed columns with the following row versions resulting from an insert
>> and 3 updates to that same row:
>>
>> v1: a,b
>> v2: a,c
>> v3: a,d
>> v4: b,d
>>
>> In your new scheme, there would be two index tuples for column 1 (one
>> pointing to v1, one pointing to v4) and 3 index tuples for column 2 (one
>> for each different value pointing to v1, v2 and v3). Did I get that
>> right so far?
>>
>> If vacuum now can remove v1, it has to update index 1 to point to v2 and
>> remove the pointer to v1 from index 2. If it can remove v1 and v2, it
>> has to update index 1 to point to v3 and remove v1 and v2 from index 2.
>> If it can remove v1, v2 and v3 it must delete the index 1 tuple pointing
>> to v1, delete the index 2 entries pointing to v1 and v2 and update the
>> index 2 entry for v3 to point to v4. Figuring out which index tuples to
>> remove and which ones to update can only be done by comparing each and
>> every indexed columns old and new values. To do so, vacuum will have to
>> fetch all the row versions, which can be scattered all over the place,
>> with all possible locking issues including but not limited to deadlocks.
>
> I'm not sure why vacuum can't run similarly to the way it does now.

What's that supposed to mean?

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2006-06-25 17:09:35 Re: vacuum, performance, and MVCC
Previous Message Jonah H. Harris 2006-06-25 16:29:02 Re: vacuum row?