Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

From: David Zhang <david(dot)zhang(at)highgo(dot)ca>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date: 2022-06-10 23:31:53
Message-ID: 57351896-8e45-1332-5416-0f562b987df9@highgo.ca
Lists: pgsql-hackers

Hi,

> > MaxAllocSize is pretty easy:
> > SELECT pg_logical_emit_message(false, long, long) FROM
repeat(repeat(' ', 1024), 1024*1023) as l(long);
> >
> > on a standby:
> >
> > 2022-03-11 16:41:59.336 PST [3639744][startup][1/0:0] LOG:  record
length 2145386550 at 0/3000060 too long
>
> Thanks for the reference. I was already playing around with 2PC log
> records (which can theoretically contain >4GB of data); but your
> example is much easier and takes significantly less time.

I'm a little confused here: is patch v3 intended to solve this problem,
"record length 2145386550 at 0/3000060 too long"?
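
For reference, the reported length can be checked against the query
quoted above. This is only a sketch of the arithmetic: the constant name
MAX_ALLOC_SIZE is my own, set to the value of PostgreSQL's MaxAllocSize
(0x3fffffff), and I assume the space character is one byte in the
database encoding.

```python
# Sizes taken from the reproduction query and the standby log above.
# MAX_ALLOC_SIZE mirrors PostgreSQL's MaxAllocSize (0x3fffffff, 1 GB - 1).
MAX_ALLOC_SIZE = 0x3FFFFFFF

# repeat(repeat(' ', 1024), 1024*1023) builds one string of this many bytes
# (assuming a one-byte space character).
payload_len = 1024 * 1024 * 1023

# pg_logical_emit_message(false, long, long) passes that string twice:
# once as the prefix and once as the message content.
combined_len = 2 * payload_len

reported_len = 2145386550  # from "record length ... too long" in the log

print(payload_len <= MAX_ALLOC_SIZE)  # True: each argument alone fits
print(combined_len > MAX_ALLOC_SIZE)  # True: both together do not
print(reported_len - combined_len)    # 54: bytes beyond the two payloads
```

The remaining 54 bytes are presumably record and message header
overhead; I have not broken that figure down further.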

I set up a simple primary/standby streaming replication environment
and ran the above query both before and after applying patch v3.
The error still occurs, but with a different message.

Before patch v3, the error is shown below:

2022-06-10 15:32:25.307 PDT [4253] LOG: record length 2145386550 at
0/3000060 too long
2022-06-10 15:32:47.763 PDT [4257] FATAL:  terminating walreceiver
process due to administrator command
2022-06-10 15:32:47.763 PDT [4253] LOG:  record length 2145386550 at
0/3000060 too long

After patch v3, the error is different:

2022-06-10 15:53:53.397 PDT [12848] LOG: record length 2145386550 at
0/3000060 too long
2022-06-10 15:54:07.249 PDT [12852] FATAL:  could not receive data from
WAL stream: ERROR:  requested WAL segment 000000010000000000000045 has
already been removed
2022-06-10 15:54:07.275 PDT [12848] LOG:  record length 2145386550 at
0/3000060 too long

Once the error happens, the standby can't continue replication.
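
The segment named in the FATAL message can be related to the record's
start position with some arithmetic. This is a rough sketch only: it
assumes the default 16 MB WAL segment size (wal_segment_size is not
stated above), it ignores WAL page headers, so the last segment it
computes is a lower bound, and the helper names are made up for
illustration.

```python
# Assumption: default 16 MB WAL segments (not confirmed for this setup).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def parse_lsn(text):
    """Convert an LSN such as '0/3000060' into a byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def segment_number(wal_file_name):
    """Convert a 24-hex-digit WAL file name into a segment number."""
    log = int(wal_file_name[8:16], 16)   # high part of the position
    seg = int(wal_file_name[16:24], 16)  # segment within that part
    return log * SEGMENTS_PER_XLOGID + seg


record_start = parse_lsn("0/3000060")
record_end = record_start + 2145386550  # reported record length

first_segment = record_start // WAL_SEGMENT_SIZE
# Lower bound: page headers push the real end of the record further out.
last_segment = (record_end - 1) // WAL_SEGMENT_SIZE
requested = segment_number("000000010000000000000045")

print(first_segment, last_segment, requested)
```

Under those assumptions the requested segment falls inside the range
of segments that the oversized record itself occupies.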

Is there a particular reason to say "more datas" at line 52 in patch v3?

+ * more datas than are being accounted for by the XLog infrastructure.

On 2022-04-18 10:19 p.m., Michael Paquier wrote:
> On Mon, Apr 18, 2022 at 05:48:50PM +0200, Matthias van de Meent wrote:
>> Seeing that the busiest time for PG15 - the last commitfest before the
>> feature freeze - has passed, could someone take another look at this?
> The next minor release is three weeks away, so now would be a good
> time to get that addressed. Heikki, Andres, are you planning to look
> more at what has been proposed here?
> --
> Michael

Thank you,

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
