Re: Lowering the ever-growing heap->pd_lower

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lowering the ever-growing heap->pd_lower
Date: 2021-03-09 21:35:55
Message-ID: 2045371.1615325755@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> writes:
> The only two existing mechanisms that I could find (in the access/heap
> directory) that possibly could fail on shrunken line pointer arrays;
> being xlog recovery (I do not have enough knowledge on recovery to
> determine if that may touch pages that have shrunken line pointer
> arrays, or if those situations won't exist due to never using dirtied
> pages in recovery) and backwards table scans on non-MVCC snapshots
> (which would be fixed in the attached patch).

I think you're not visualizing the problem properly.

The case I was concerned about back when is that there are various bits of
code that may visit a page with a predetermined TID in mind to look at.
An index lookup is an obvious example, and another one is chasing an
update chain's t_ctid link. You might argue that if the tuple was dead
enough to be removed, there should be no such in-flight references to
worry about, but I think such an assumption is unsafe. There is not, for
example, any interlock that ensures that a process that has obtained a TID
from an index will have completed its heap visit before a VACUUM that
subsequently removed that index entry also removes the heap tuple.

So, to accept a patch that shortens the line pointer array, what we need
to do is verify that every such code path checks for an out-of-range
offset before trying to fetch the target line pointer. I believed
back in 2007 that there were, or once had been, code paths that omitted
such a range check, assuming that they could trust the TID they had
gotten from $wherever to point at an extant line pointer array entry.
Maybe things are all good now, but I think you should run around and
examine every place that checks for tuple deadness to see if the offset
it used is known to be within the current page bounds.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-03-09 21:47:18 Re: Lowering the ever-growing heap->pd_lower
Previous Message Bossart, Nathan 2021-03-09 21:33:31 Re: partial heap only tuples