Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

From: Andres Freund <andres(at)anarazel(dot)de>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date: 2022-12-02 17:09:13
Message-ID: 20221202170913.rhhryfx5eosvjcib@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-07-26 18:58:02 +0200, Matthias van de Meent wrote:
> - updated the MaxXLogRecordSize and XLogRecordLengthIsValid(len)
> macros (now in xlogrecord.h), with a max length of the somewhat
> arbitrary 1020MiB.
> This leaves room for approx. 4MiB of per-record allocation overhead
> before you'd hit MaxAllocSize, and also detaches the dependency on
> memutils.h.
>
> - Retained the check in XLogRegisterData, so that we check against
> integer overflows in the registerdata code instead of only an assert
> in XLogRecordAssemble where it might be too late.
> - Kept the inline static elog-ing function (as per Andres' suggestion
> on 2022-03-14; this decreases binary sizes)

I don't think it should be a static inline. It should to be a *non* inlined
function, so we don't include the code for the elog in the callers.

> +/*
> + * Error due to exceeding the maximum size of a WAL record, or registering
> + * more datas than are being accounted for by the XLog infrastructure.
> + */
> +static inline void
> +XLogErrorDataLimitExceeded()
> +{
> + elog(ERROR, "too much WAL data");
> +}

I think this should be pg_noinline, as mentioned above.

> /*
> * Begin constructing a WAL record. This must be called before the
> * XLogRegister* functions and XLogInsert().
> @@ -348,14 +359,29 @@ XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
> * XLogRecGetData().
> */
> void
> -XLogRegisterData(char *data, int len)
> +XLogRegisterData(char *data, uint32 len)
> {
> XLogRecData *rdata;
>
> - Assert(begininsert_called);
> + Assert(begininsert_called && XLogRecordLengthIsValid(len));
> +
> + /*
> + * Check against max_rdatas; and ensure we don't fill a record with
> + * more data than can be replayed. Records are allocated in one chunk
> + * with some overhead, so ensure XLogRecordLengthIsValid() for that
> + * size of record.
> + *
> + * Additionally, check that we don't accidentally overflow the
> + * intermediate sum value on 32-bit systems by ensuring that the
> + * sum of the two inputs is no less than one of the inputs.
> + */
> + if (num_rdatas >= max_rdatas ||
> +#if SIZEOF_SIZE_T == 4
> + mainrdata_len + len < len ||
> +#endif
> + !XLogRecordLengthIsValid((size_t) mainrdata_len + (size_t) len))
> + XLogErrorDataLimitExceeded();

This is quite a complicated check, and the SIZEOF_SIZE_T == 4 bit is fairly
ugly.

I think we should make mainrdata_len a uint64, then we don't have to worry
about it overflowing on 32bit systems. And TBH, we don't care about some minor
inefficiency on 32bit systems.

> @@ -399,8 +425,16 @@ XLogRegisterBufData(uint8 block_id, char *data, int len)
> elog(ERROR, "no block with id %d registered with WAL insertion",
> block_id);
>
> - if (num_rdatas >= max_rdatas)
> - elog(ERROR, "too much WAL data");
> + /*
> + * Check against max_rdatas; and ensure we don't register more data per
> + * buffer than can be handled by the physical data format;
> + * i.e. that regbuf->rdata_len does not grow beyond what
> + * XLogRecordBlockHeader->data_length can hold.
> + */
> + if (num_rdatas >= max_rdatas ||
> + regbuf->rdata_len + len > UINT16_MAX)
> + XLogErrorDataLimitExceeded();
> +
> rdata = &rdatas[num_rdatas++];
>
> rdata->data = data;

This partially has been applied in ffd1b6bb6f8, I think we should consider
adding XLogErrorDataLimitExceeded() separately too.

> rdt_datas_last->next = regbuf->rdata_head;
> @@ -858,6 +907,16 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
> for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
> COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
>
> + /*
> + * Ensure that the XLogRecord is not too large.
> + *
> + * XLogReader machinery is only able to handle records up to a certain
> + * size (ignoring machine resource limitations), so make sure we will
> + * not emit records larger than those sizes we advertise we support.
> + */
> + if (!XLogRecordLengthIsValid(total_len))
> + XLogErrorDataLimitExceeded();
> +
> /*
> * Fill in the fields in the record header. Prev-link is filled in later,
> * once we know where in the WAL the record will be inserted. The CRC does

I think this needs to mention that it'll typically cause a PANIC.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-12-02 17:15:37 Re: Failed Assert in pgstat_assoc_relation
Previous Message Andres Freund 2022-12-02 16:57:17 Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths