Re: Lack of PageSetLSN in heap_xlog_visible

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Subject: Re: Lack of PageSetLSN in heap_xlog_visible
Date: 2022-10-13 19:49:35
Message-ID: 039076d4f6cdd871691686361f83cb8a6913a86a.camel@j-davis.com
Lists: pgsql-hackers

On Thu, 2022-10-13 at 12:50 +0300, Konstantin Knizhnik wrote:
>          /*
>           * We don't bump the LSN of the heap page when setting the visibility
>           * map bit (unless checksums or wal_log_hints is enabled, in which
>           * case we must), because that would generate an unworkable volume of
>           * full-page writes.

It clearly says there that it must set the page LSN, but I don't see
where that's happening. It seems to go all the way back to the original
checksums commit, 96ef3b8ff1.

I can reproduce a case where a replica ends up with a different page
header than the primary (checksums enabled):

Primary:
create extension pageinspect;
create table t(i int) with (autovacuum_enabled=off);
insert into t values(0);

Shut down and restart primary and replica.

Primary:
insert into t values(1);
vacuum t;

Crash replica and let it recover.

Shut down and restart primary and replica.

Primary:
select * from page_header(get_raw_page('t', 0));

Replica:
select * from page_header(get_raw_page('t', 0));

The LSN on the replica is lower, but the flags are the same
(PD_ALL_VISIBLE set). That's a problem, right? The checksums are valid
on both, though.
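
For a side-by-side check, a narrower query can be run on both nodes. This
is only a sketch: it assumes the same table t as above, and the literal 4
is my reading of PD_ALL_VISIBLE (0x0004) from bufpage.h.

```sql
-- Run on both primary and replica, then compare the lsn column
-- between the two nodes while pd_all_visible is true on each.
select lsn,
       checksum,
       flags,
       (flags & 4) <> 0 as pd_all_visible  -- assumes PD_ALL_VISIBLE = 0x0004
from page_header(get_raw_page('t', 0));
```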

It may violate our torn page protections for checksums, as well, but I
couldn't construct a scenario for that because recovery can only create
restartpoints at certain times.

> But it still not clear for me that not bumping LSN in this place is
> correct if wal_log_hints is set.
> In this case we will have VM page with larger LSN than heap page,
> because visibilitymap_set bumps LSN of VM page. It means that in
> theory after recovery we may have page marked as all-visible in VM,
> but not having PD_ALL_VISIBLE in page header. And it violates VM
> constraint:

I'm not quite following this scenario. If the heap page has a lower LSN
than the VM page, how could we recover to a point where the VM bit is
set but the heap flag isn't? And what does it have to do with
wal_log_hints/checksums?

--
Jeff Davis
PostgreSQL Contributor Team - AWS
