Re: Panic during xlog building with big values

From: Andy Pogrebnoi <andrew(dot)pogrebnoi(at)percona(dot)com>
To: "Maksim(dot)Melnikov" <m(dot)melnikov(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Panic during xlog building with big values
Date: 2025-09-26 12:33:22
Message-ID: 18639C15-3830-492E-B5F0-9924C43C6C5C@percona.com
Lists: pgsql-hackers

Hi,

> On Jul 7, 2025, at 17:05, Maksim.Melnikov <m(dot)melnikov(at)postgrespro(dot)ru> wrote:
>
> Hello,
> during testing we found following issue when update records with big values.
>
> 2025-07-07 14:40:30.434 MSK [125435] PANIC: oversized WAL record
> 2025-07-07 14:40:30.434 MSK [125435] DETAIL: WAL record would be 1073742015 bytes (of maximum 1069547520 bytes); rmid 10 flags 64.
>
> tested commit: 62a17a92833d1eaa60d8ea372663290942a1e8eb
>
> Test description:
>
> set wal_level = logical in postgresql.conf
>
> CREATE DATABASE regression_big_values WITH TEMPLATE = template0 ENCODING = 'UTF8';
> \c regression_big_values
> CREATE TABLE big_text_test (i int, c1 text, c2 text);
> -- Mark columns as toastable, but don't try to compress
> ALTER TABLE big_text_test ALTER c1 SET STORAGE EXTERNAL;
> ALTER TABLE big_text_test ALTER c2 SET STORAGE EXTERNAL;
> ALTER TABLE big_text_test REPLICA IDENTITY FULL;
> INSERT INTO big_text_test (i, c1, c2) VALUES (1, repeat('a', 1073741737), NULL);
> UPDATE big_text_test SET c2 = repeat('b', 1073741717);
>

I tried the patch and it fixes the test case: it now produces an ERROR instead of a PANIC.

I’m wondering, though, if there are other places that can produce huge records besides ExtractReplicaIdentity? Andres Freund has also suggested changes in RecordTransactionCommit(), for example [1].

> @@ -9043,6 +9044,33 @@ log_heap_update(Relation reln, Buffer oldbuf,
> return recptr;
> }
>
> +/*
> + * Pre-check potential XLogRecord oversize. XLogRecord will be created
> + * later, and it size will be checked, but it will occur in critical
> + * section and in case of failure core dump will be generated.
> + * It seems not good, so to avoid this, we can calculate approximate
> + * xlog record size here and check it.
> + *
> + * Size prediction is based on xlog update and xlog delete logic and can
> + * be revised in case of it changing, now buf size is limited by
> + * UINT16_MAX(Assert(regbuf->rdata_len <= UINT16_MAX) in xloginsert).
> + *
> + * Anyway to accommodate some overhead, 1M is substract from predicted
> + * value. It seems now it is quite enough.
> + */

I also suggest tidying up grammar and syntax a bit in the comment above. My variant would be:

/*
 * Pre-check a potentially oversized XLogRecord. The XLogRecord will be
 * created later and its size checked then, but that check happens
 * inside a critical section, where a failure generates a core dump.
 * To avoid that, we can estimate the xlog record size here and check
 * it up front.
 *
 * The size prediction is based on the xlog update and xlog delete
 * logic and may need revising if that logic changes. For now, the buf
 * size is limited to UINT16_MAX (Assert(regbuf->rdata_len <= UINT16_MAX)
 * in xloginsert).
 *
 * To accommodate some overhead, 1 MB is subtracted from the predicted
 * value. That seems to be enough for now.
 */

Cheers,
Andy

[1] https://www.postgresql.org/message-id/20221202165717.wtdd5ijoqawrdt75%40awork3.anarazel.de
