Re: Catalog invalidations vs catalog scans vs ScanPgRelation()

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Catalog invalidations vs catalog scans vs ScanPgRelation()
Date: 2020-03-28 20:54:10
Message-ID: 20200328205410.6naxecu7lgbeqyqt@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-03-28 12:30:40 -0700, Andres Freund wrote:
> On 2020-02-28 22:10:52 -0800, Andres Freund wrote:
> > On 2020-02-28 21:24:59 -0800, Andres Freund wrote:
> > So, um. What happens is that doDeletion() does a catalog scan, which
> > sets a snapshot. The results of that catalog scan are then used to
> > perform modifications. But at that point there's no guarantee that we
> > still hold *any* snapshot, as e.g. invalidations can trigger the catalog
> > snapshot being released.
> >
> > I don't see how that's safe. Without ->xmin preventing that,
> > intermediate row versions that we did look up could just get vacuumed
> > away, and replaced with a different row. That does seem like a serious
> > issue?
> >
> > I think there's likely a lot of places that can hit this? What makes it
> > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release
> > ->xmin when there previously has been *any* catalog access? Because in
> > contrast to normal table modifications, there's currently nothing at all
> > forcing us to hold a snapshot between catalog lookups an their
> > modifications?
> >
> > Am I missing something? Or is this a fairly significant hole in our
> > arrangements?
> >
> > The easiest way to fix this would probably be to have inval.c call a
> > version of InvalidateCatalogSnapshot() that leaves the oldest catalog
> > snapshot around, but sets up things so that GetCatalogSnapshot() will
> > return a freshly taken snapshot? ISTM that pretty much every
> > InvalidateCatalogSnapshot() call within a transaction needs that behaviour?
>
> I'd still like to get some input here.

Attached is a one patch that adds assertions to detect this, and one
that puts enough workarounds in place to make the tests pass. I don't
like this much, but I thought it'd be useful for others to understand
the problem.

Greetings,

Andres Freund

Attachment Content-Type Size
v1-0001-TMP-work-around-missing-snapshot-registrations.patch text/x-diff 8.5 KB
v1-0002-Improve-and-extend-asserts-for-a-snapshot-being-s.patch text/x-diff 6.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-03-28 21:30:23 Potential (low likelihood) wraparound hazard in heap_abort_speculative()
Previous Message Andrew Dunstan 2020-03-28 19:35:07 Re: ssl passphrase callback