Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: David Zhang <david(dot)zhang(at)highgo(dot)ca>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date: 2022-07-25 11:17:21
Message-ID: CAEze2Wg8SYN0DAL6HE7WifP+O9ukPttyjwbqd=e4L2mTTrGv=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 13 Jul 2022 at 07:54, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Jul 11, 2022 at 02:26:46PM +0200, Matthias van de Meent wrote:
> > Thanks for reviewing.
>
> I think that v6 is over-engineered because there should be no need to
> add a check in xlogreader.c as long as the origin of the problem is
> blocked, no? And the origin here is when the record is assembled. At
> least this is the cleanest solution for HEAD, but not in the
> back-branches if we'd care about doing something with records already
> generated, and I am not sure that we need to care about other things
> than HEAD, TBH.

I would prefer it if we would fix the "cannot catch up to primary
because of oversized WAL record" issue in backbranches too. Rather
than failing to recover after failure or breaking replication streams,
I'd rather be unable to write the singular offending WAL record and
break up to one transaction.

> So it seems to me that there is no need to create a
> XLogRecMaxLength which is close to a duplicate of
> DecodeXLogRecordRequiredSpace().
>
> @@ -519,7 +549,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
> XLogRecPtr *fpw_lsn, int *num_fpi, bool *topxid_included)
>
> {
> XLogRecData *rdt;
> - uint32 total_len = 0;
> + uint64 total_len = 0;
> This has no need to change.
>
> My suggestion from upthread was close to what you proposed, but I had
> in mind something simpler, as of:
>
> + /*
> + * Ensure that xlogreader.c can read the record.
> + */
> + if (unlikely(!AllocSizeIsValid(DecodeXLogRecordRequiredSpace(total_len))))
> + elog(ERROR, "too much WAL data");

Huh, yeah, I hadn't thought of that, but that's much simpler indeed.

> This would be the amount of data allocated by the WAL reader when it
> is possible to allocate an oversized record, related to the business
> of the circular buffer depending on if the read is blocking or not.

Yes, I see your point.

> Among the two problems to solve at hand, the parts where the APIs are
> changed and made more robust with unsigned types and where block data
> is not overflowed with its 16-byte limit are committable, so I'd like
> to do that first (still need to check its performance with some micro
> benchmark on XLogRegisterBufData()).

> The second part to block the
> creation of the assembled record is simpler, now
> DecodeXLogRecordRequiredSpace() would make the path a bit hotter,
> though we could inline it in the worst case?

I think that would be better for performance, yes.
DecodeXLogRecordRequiredSpace will already be optimized to just a
single addition by any of `-O[123]`, so keeping this indirection is
quite expensive (relative to the operation being performed).

As for your patch patch:

> +XLogRegisterData(char *data, uint32 len)
> {
> XLogRecData *rdata;
>
> Assert(begininsert_called);
>
> - if (num_rdatas >= max_rdatas)
> + if (unlikely(num_rdatas >= max_rdatas))
> elog(ERROR, "too much WAL data");
> rdata = &rdatas[num_rdatas++];

XLogRegisterData is designed to be called multiple times for each
record, and this allows the user of the API to overflow the internal
mainrdata_len field if we don't check that the field does not exceed
the maximum record length (or overflow the 32-bit field). As such, I'd
still want a len-check in that function.

I'll send an updated patch by tomorrow.

- Matthias

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2022-07-25 11:31:08 [PATCH v1] strengthen backup history filename check
Previous Message Heikki Linnakangas 2022-07-25 11:12:05 Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths