Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "Amit(dot)Kapila(at)fujitsu(dot)com" <Amit(dot)Kapila(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
Date: 2021-05-06 19:32:29
Message-ID: 20210506193229.nspubcypujg26scq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-05-06 21:40:18 +0900, Masahiko Sawada wrote:
> Since we set all_visible_according_to_vm before acquiring the buffer
> lock it's likely to happen that the page gets modified and all-visible
> bit is cleared after setting true to all_visible_according_to_vm. This
> assertion can easily be reproduced by adding a delay before the buffer
> lock and invoking autovacuums frequently:

> So we should recheck also visibility map bit there but I think we can
> remove this assertion since we already do that in later code and we
> don’t treat this case as a should-not-happen case:
>
> /*
> * As of PostgreSQL 9.2, the visibility map bit should never be set if
> * the page-level bit is clear. However, it's possible that the bit
> * got cleared after we checked it and before we took the buffer
> * content lock, so we must recheck before jumping to the conclusion
> * that something bad has happened.
> */
> else if (all_visible_according_to_vm && !PageIsAllVisible(page)
> && VM_ALL_VISIBLE(vacrel->rel, blkno, &vmbuffer))
> {
> elog(WARNING, "page is not marked all-visible but
> visibility map bit is set in relation \"%s\" page %u",
> vacrel->relname, blkno);
> visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
> VISIBILITYMAP_VALID_BITS);
> }
>
> I've attached a patch removing the assertion.

I think it'd be a good idea to audit the other uses of
all_visible_according_to_vm to make sure there's no issues there. Can't
this e.g. make us miss setting all-visible in

/*
* Handle setting visibility map bit based on what the VM said about
* the page before pruning started, and using prunestate
*/
if (!all_visible_according_to_vm && prunestate.all_visible)

Perhaps we should update all_visible_according_to_vm after locking the
buffer, if PageIsAllVisible(page) is true?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-05-06 19:45:20 Re: use `proc->pgxactoff` as the value of `index` in `ProcArrayRemove()`
Previous Message Tom Lane 2021-05-06 19:31:02 Re: Printing backtrace of postgres processes