Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "Amit(dot)Kapila(at)fujitsu(dot)com" <Amit(dot)Kapila(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
Date: 2021-05-06 20:35:56
Message-ID: CAH2-Wzm=PDM5R3_oJA-9eR98Y1CjfAuEQco2T15zpTRh1YD8=w@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 6, 2021 at 12:32 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I think it'd be a good idea to audit the other uses of
> all_visible_according_to_vm to make sure there's no issues there. Can't
> this e.g. make us miss setting all-visible in
>
> /*
> * Handle setting visibility map bit based on what the VM said about
> * the page before pruning started, and using prunestate
> */
> if (!all_visible_according_to_vm && prunestate.all_visible)

I don't think so, because it's the inverse case -- the condition that
you quote is concerned with the case where we found the VM all_visible
bit to not be set earlier, and then found that we could set it now.

The assertion failed because the VM's all_visible bit was set
initially, but concurrently unset by some other backend. The
all_visible_according_to_vm tracking variable became stale, so it
wasn't correct to expect current information from prunestate to agree
that the page is still all_visible.
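
A toy sketch of that race in C, for illustration only. This is not the
vacuumlazy.c code: ToyPage, toy_concurrent_update() and
toy_snapshot_still_agrees() are hypothetical names, and a single-threaded
function call stands in for the other backend. Only the variable name
all_visible_according_to_vm and the idea of a fresh all_visible result
from pruning come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one heap page plus its visibility map bit. */
typedef struct ToyPage
{
    bool vm_bit;              /* toy VM all_visible bit for the page */
    bool every_tuple_visible; /* what a fresh look at the page would find */
} ToyPage;

/* Stand-in for another backend modifying the page and clearing the bit. */
static void
toy_concurrent_update(ToyPage *page)
{
    page->every_tuple_visible = false;
    page->vm_bit = false;
}

/*
 * Returns true when the early VM read still agrees with a fresh look at
 * the page, i.e. when "early flag set implies page all-visible" holds.
 */
static bool
toy_snapshot_still_agrees(bool concurrent_writer)
{
    ToyPage page = {.vm_bit = true, .every_tuple_visible = true};

    /* Early read of the VM bit; correct at the moment it is taken. */
    bool all_visible_according_to_vm = page.vm_bit;

    assert(all_visible_according_to_vm);

    /* Window in which some other backend may touch the page. */
    if (concurrent_writer)
        toy_concurrent_update(&page);

    /* Fresh information, as pruning would gather it from the page now. */
    bool prunestate_all_visible = page.every_tuple_visible;

    return !all_visible_according_to_vm || prunestate_all_visible;
}
```

With no concurrent writer the early read and the fresh look agree. With
one, the early read is stale and they do not, which is the shape of the
failed assertion.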

High-level philosophical observation: This reminds me of another way
in which things are too tightly coupled in VACUUM. It's really a pity
that the visibility map's all_visible bit serves two purposes -- it
remembers pages that VACUUM doesn't have to visit (except perhaps if
it's an aggressive VACUUM), and is also used for index-only scans. If
it were just used for index-only scans then I don't think it would be
necessary for a HOT update to unset a page's all_visible bit. Since a
HOT chain's members are always versions of the same logical row, there
is no reason why an index-only scan needs to care which precise
version is actually visible to its MVCC snapshot (once we know that
there must be exactly one version from each HOT chain).

--
Peter Geoghegan
