From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Michael Paquier' <michael(at)paquier(dot)xyz>, vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Subject: | RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages |
Date: | 2025-06-26 02:20:05 |
Message-ID: | OSCPR01MB14966089C4A0C1F3AF1E8E8A5F57AA@OSCPR01MB14966.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Michael, Vignesh,
> On Wed, Jun 25, 2025 at 10:19:55PM +0530, vignesh C wrote:
> > Currently, the logic attempts to read the complete WAL record based on
> > the size obtained before the crash—even though only a partial record
> > was written. It then checks the page header to determine whether the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag is set only after reading the
> > complete WAL record at XLogDecodeNextRecord function, but since that
> > much WAL data was not available in the system we never get a chance to
> > check the header after this. To address this issue, a more robust
> > approach would be to first read the page header, check if the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag is set, and only then proceed
> > to read the WAL record size if the record is not marked as a partial
> > overwrite. This would prevent the system from waiting for WAL data
> > that will never arrive. Attached partial_wal_record_fix.patch patch
> > for this.
>
> So you are suggesting the addition of an extra ReadPageInternal() that
> forces a read of only the header, perform the checks on the header, then
> read the rest. After reading SizeOfXLogShortPHD worth of data,
> shouldn't the checks on xlp_rem_len be done a bit earlier than what
> you are proposing in this patch?

I have a concern from a performance perspective. This approach must read the
page twice in every case, right? The workaround is needed only for a corner case,
but it affects all code paths. Or is the extra cost actually negligible?
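
To make the ordering under discussion concrete, here is a minimal sketch of a header-first check. It is a toy model under stated assumptions, not code from xlogreader.c or from the patch: ToyPageHeader, ToyReadResult and toy_check_continuation() are hypothetical names, the header layout is simplified, and records spanning more than two pages are ignored. Only the two flag values are meant to mirror the xlp_info bits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values intended to mirror the xlp_info bits of the same names. */
#define TOY_XLP_FIRST_IS_CONTRECORD           0x0001
#define TOY_XLP_FIRST_IS_OVERWRITE_CONTRECORD 0x0008

/* Simplified, hypothetical stand-in for the short page header. */
typedef struct ToyPageHeader
{
    uint16_t info;     /* flag bits */
    uint32_t rem_len;  /* bytes of the continued record still to come */
} ToyPageHeader;

typedef enum ToyReadResult
{
    TOY_READ_OK,          /* the continuation data is fully available */
    TOY_READ_OVERWRITTEN, /* the partial record was abandoned */
    TOY_READ_WAIT,        /* more WAL is needed before deciding */
    TOY_READ_INVALID      /* header does not match the record being read */
} ToyReadResult;

/*
 * Decide how to continue a record that crosses onto 'page'.
 * 'avail' is the number of valid bytes on the page; 'need' is how many
 * bytes the record still requires according to its length field.
 */
ToyReadResult
toy_check_continuation(const unsigned char *page, size_t avail, uint32_t need)
{
    ToyPageHeader hdr;

    /* Step 1: only the header is required to make the first decision. */
    if (avail < sizeof(ToyPageHeader))
        return TOY_READ_WAIT;
    memcpy(&hdr, page, sizeof(hdr));

    /* Step 2: an overwrite marker means the missing bytes never arrive. */
    if (hdr.info & TOY_XLP_FIRST_IS_OVERWRITE_CONTRECORD)
        return TOY_READ_OVERWRITTEN;

    /* Step 3: otherwise the page must continue the record we are reading. */
    if (!(hdr.info & TOY_XLP_FIRST_IS_CONTRECORD) || hdr.rem_len != need)
        return TOY_READ_INVALID;

    /* Step 4: only now ask for the record's remaining bytes. */
    if (avail - sizeof(ToyPageHeader) < need)
        return TOY_READ_WAIT;
    return TOY_READ_OK;
}
```

In this model, a page that carries the overwrite flag is recognized as soon as its header is available, even if far fewer bytes exist than the old record's length asked for. Whether the additional header read is measurable in the real reader is exactly the open question above; the sketch does not answer it.
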
> Another reliable approach would be to make the
> code wait before reading the record in the internal loop of
> ReadPageInternal() with an injection point when we know that we have a
> contrecord, but I'm not really excited about this prospect in
> xlogreader.c which can be also used in the frontend.

As I understand it, an injection point would have to be added where a WAL record
is flushed, in order to emulate the incomplete WAL record issue. Just to confirm:
how would it be used in ReadPageInternal()?

Best regards,
Hayato Kuroda
FUJITSU LIMITED