Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date: 2022-03-11 20:42:42
Message-ID: 7e44dc64-c6e3-fd4e-4c05-ad36555f57f5@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/03/2022 17:42, Matthias van de Meent wrote:
> Hi,
>
> Xlogreader limits the size of what it considers valid xlog records to
> MaxAllocSize; but this is not currently enforced in the
> XLogRecAssemble API. This means it is possible to assemble a record
> that postgresql cannot replay.

Oops, that would be nasty.

> Similarly; it is possible to repeatedly call XlogRegisterData() so as
> to overflow rec->xl_tot_len; resulting in out-of-bounds reads and
> writes while processing record data;

And that too.

Have you been able to create a test case for that? The largest record I
can think of is a commit record with a huge number of subtransactions,
dropped relations, and shared inval messages. I'm not sure if you can
overflow a uint32 with that, but exceeding MaxAllocSize seems possible.

> PFA a patch that attempts to fix both of these issues in the insertion
> API; by checking against overflows and other incorrectly large values
> in the relevant functions in xloginsert.c. In this patch, I've also
> added a comment to the XLogRecord spec to document that xl_tot_len
> should not be larger than 1GB - 1B; and why that limit exists.
> diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
> index c260310c4c..ae654177de 100644
> --- a/src/backend/access/transam/xloginsert.c
> +++ b/src/backend/access/transam/xloginsert.c
> @@ -342,6 +342,11 @@ XLogRegisterData(char *data, int len)
>
> if (num_rdatas >= max_rdatas)
> elog(ERROR, "too much WAL data");
> +
> + /* protect against overflow */
> + if (unlikely((uint64) mainrdata_len + (uint64) len > UINT32_MAX))
> + elog(ERROR, "too much WAL data");
> +
> rdata = &rdatas[num_rdatas++];
>
> rdata->data = data;

Could check for just AllocSizeValid(mainrdata_len), if we're only
worried about the total size of the data to exceed the limit, and assume
that each individual piece of data is smaller.

We also don't check for negative 'len'. I think that's fine, the caller
bears some responsibility for passing valid arguments too. But maybe
uint32 or size_t would be more appropriate here.

I wonder if these checks hurt performance. These are very cheap, but
then again, this codepath is very hot. It's probably fine, but it still
worries me a little. Maybe some of these could be Asserts.

> @@ -387,6 +392,11 @@ XLogRegisterBufData(uint8 block_id, char *data, int len)
>
> if (num_rdatas >= max_rdatas)
> elog(ERROR, "too much WAL data");
> +
> + /* protect against overflow */
> + if (unlikely((uint64) regbuf->rdata_len + (uint64) len > UINT32_MAX))
> + elog(ERROR, "too much WAL data");
> +
> rdata = &rdatas[num_rdatas++];
>
> rdata->data = data;

Could check "len > UINT16_MAX". As you noted in XLogRecordAssemble,
that's the real limit. And if you check for that here, you don't need to
check it in XLogRecordAssemble.

> @@ -505,7 +515,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
> XLogRecPtr *fpw_lsn, int *num_fpi, bool *topxid_included)
> {
> XLogRecData *rdt;
> - uint32 total_len = 0;
> + uint64 total_len = 0;
> int block_id;
> pg_crc32c rdata_crc;
> registered_buffer *prev_regbuf = NULL;

I don't think the change to uint64 is necessary. If all the data blocks
are limited to 64 kB, and the number of blocks is limited, and the
number of blocks is limited too.

> @@ -734,6 +744,10 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
>
> if (needs_data)
> {
> + /* protect against overflow */
> + if (unlikely(regbuf->rdata_len > UINT16_MAX))
> + elog(ERROR, "too much WAL data for registered buffer");
> +
> /*
> * Link the caller-supplied rdata chain for this buffer to the
> * overall list.
> @@ -836,6 +850,13 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
> for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
> COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
>
> + /*
> + * Ensure that xlogreader.c can read the record; and check that we don't
> + * accidentally overflow the size of the record.
> + * */
> + if (unlikely(!AllocSizeIsValid(total_len) || total_len > UINT32_MAX))
> + elog(ERROR, "too much registered data for WAL record");
> +
> /*
> * Fill in the fields in the record header. Prev-link is filled in later,
> * once we know where in the WAL the record will be inserted. The CRC does

It's enough to check AllocSizeIsValid(total_len), no need to also check
against UINT32_MAX.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2022-03-11 20:44:15 Issue with pg_stat_subscription_stats
Previous Message Robert Haas 2022-03-11 20:39:13 Re: pg_walinspect - a new extension to get raw WAL data and WAL stats