Re: Multixid hindsight design

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Multixid hindsight design
Date: 2015-05-12 07:37:44
Message-ID: 5551ADC8.8010404@iki.fi
Lists: pgsql-hackers

On 05/12/2015 01:51 AM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> So the lesson here is that having a permanent pg_multixact is not nice,
>> and we should get rid of it. Here's how to do that:
>
> That would be cool, but ...
>
>> Looking at the tuple header, the CID and CTID fields are only needed,
>> when either xmin or xmax is running. Almost: in a HOT-updated tuple,
>> CTID is required even after xmax has committed, but since it's a HOT
>> update, the new tuple is always on the same page so you only need the
>> offsetnumber part.
>
> I think this is totally wrong. HOT update or not, you need the forward
> link represented by ctid not just until xmin/xmax commit, but until
> they're in the past according to all active snapshots. That's so that
> other transactions can chain forward to the "current" version of a tuple
> which they found according to their snapshots.
>
> It might be you can salvage the idea anyway, since it's still true that
> the forward links wouldn't be needed after a crash and restart. But the
> argument as stated is wrong.

Ah yes, I stated that wrong. What I meant was that the CID and CTID
fields are not needed once xmin and xmax are older than the global xmin.
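
To make the corrected condition concrete, here is a rough sketch in C. It
is an illustration only, not PostgreSQL source: the names Xid, INVALID_XID,
xid_precedes and cid_ctid_still_needed are made up for this example, and it
assumes a plain wraparound-aware XID comparison while ignoring aborted
transactions, frozen/special XIDs and hint bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t Xid;

/* Assumption for this sketch: 0 means "no xmax set". */
#define INVALID_XID ((Xid) 0)

/* Wraparound-aware "a is older than b" (modulo-2^32 comparison). */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Returns true if the tuple's CID/CTID may still be needed, i.e. if
 * xmin or a valid xmax is not yet older than the global xmin, so some
 * snapshot could still have to follow the forward link.
 */
static bool
cid_ctid_still_needed(Xid xmin, Xid xmax, Xid global_xmin)
{
    if (!xid_precedes(xmin, global_xmin))
        return true;
    if (xmax != INVALID_XID && !xid_precedes(xmax, global_xmin))
        return true;
    return false;
}
```

With a global xmin of 200, a tuple with xmin 100 and xmax 150 no longer
needs the fields, while one with xmin 100 and xmax 250 still does.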

> (There's also the question of forensic requirements, although I'm aware
> that it's barely worth bringing that up since nobody else here seems to
> put much weight on that.)

I do care about that. In this scheme, you would always have the
updater/deleter XMAX on the tuple itself, which IMO is more useful for
forensic purposes than a multixid. You lose the CID and CTID in the
tuple (for tuples that are updated and locked at the same time), but if
you keep the TED around longer, all of that information is still
available. On the whole, I don't think this is much worse than the
current situation.

- Heikki
