Re: new heapcheck contrib module

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-04-20 20:30:26
Message-ID: 20200420203026.nmyq3bec3dvzy3ky@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-20 15:59:49 -0400, Robert Haas wrote:
> On Mon, Apr 20, 2020 at 3:42 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I don't think random interspersed uses of CLogTruncationLock are a good
> > idea. If you move to only checking visibility after tuple fits into
> > [relfrozenxid, nextXid), then you don't need to take any locks here, as
> > long as a lock against vacuum is taken (which I think this should do
> > anyway).
>
> I think it would be *really* good to avoid ShareUpdateExclusiveLock
> here. Running with only AccessShareLock would be a big advantage. I
> agree that any use of CLogTruncationLock should not be "random", but I
> don't see why the same method we use to make txid_status() safe to
> expose to SQL shouldn't also be used here.

A few billion CLogTruncationLock acquisitions in short order will likely
have at least as big an impact as ShareUpdateExclusiveLock held for the
duration of the check. That's not really a relevant concern or
txid_status(). Per-tuple lock acquisitions aren't great.

I think it might be doable to not need either. E.g. we could set the
checking backend's xmin to relfrozenxid, and set somethign like
PROC_IN_VACUUM. That should, I think, prevent clog from being truncated
in a problematic way (clog truncations look at PROC_IN_VACUUM backends),
while not blocking vacuum.

The similar concern for ReadNewTransactionId() can probably more easily
be addressed, by only calling ReadNewTransactionId() when encountering
an xid that's newer than the last value read.

I think it'd be good to set PROC_IN_VACUUM (or maybe a separate version
of it) while checking anyway. Reading the full relation can take quite a
while, and we shouldn't prevent hot pruning while doing so.

There's some things we'd need to figure out to be able to use
PROC_IN_VACUUM, as that's really only safe in some
circumstances. Possibly it'd be easiest to address that if we'd make the
check a procedure...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-04-20 20:36:16 Re: design for parallel backup
Previous Message Robert Haas 2020-04-20 20:21:25 Re: design for parallel backup