Re: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages
Date: 2025-06-26 03:18:32
Message-ID: CAFiTN-sFPgNJ=SUe09Ouc+DxFhipOJSzLDq4WWZfc-o2-UfK2g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 26, 2025 at 6:22 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jun 25, 2025 at 10:19:55PM +0530, vignesh C wrote:
> > Currently, the logic attempts to read the complete WAL record based on
> > the size obtained before the crash, even though only a partial record
> > was written. It checks the page header for the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag only after reading the
> > complete WAL record in the XLogDecodeNextRecord function, but since that
> > much WAL data is not available in the system, we never get a chance to
> > check the header. To address this issue, a more robust
> > approach would be to first read the page header, check if the
> > XLP_FIRST_IS_OVERWRITE_CONTRECORD flag is set, and only then proceed
> > to read the WAL record size if the record is not marked as a partial
> > overwrite. This would prevent the system from waiting for WAL data
> > that will never arrive. The attached partial_wal_record_fix.patch
> > implements this.

Yeah, this is a problem. At the moment I cannot think of anything
better than reading the page header first and checking the
XLP_FIRST_IS_OVERWRITE_CONTRECORD flag.
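
To make the ordering concrete, here is a rough standalone sketch of the
"header first" check. It is not the attached patch and not the actual
xlogreader.c code: SketchPageHeader is a simplified stand-in for
XLogPageHeaderData, and sketch_check_next_page() and the SKETCH_* result
codes are hypothetical names made up for illustration. It also assumes the
rest of the record fits on the next page. Only the two flag values are
taken from xlog_internal.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined in PostgreSQL's xlog_internal.h. */
#define XLP_FIRST_IS_CONTRECORD            0x0001
#define XLP_FIRST_IS_OVERWRITE_CONTRECORD  0x0008

/* Simplified stand-in for XLogPageHeaderData (short header only). */
typedef struct SketchPageHeader
{
    uint16_t    xlp_magic;
    uint16_t    xlp_info;       /* flag bits, see above */
    uint32_t    xlp_tli;
    uint64_t    xlp_pageaddr;
    uint32_t    xlp_rem_len;    /* bytes of the record continued here */
} SketchPageHeader;

/* Hypothetical outcomes of looking at the continuation page. */
typedef enum SketchReadResult
{
    SKETCH_NEED_MORE_WAL,       /* not enough data yet; caller would wait */
    SKETCH_RECORD_OVERWRITTEN,  /* partial record was abandoned */
    SKETCH_CONTINUE_RECORD      /* safe to reassemble the record */
} SketchReadResult;

/*
 * Decide what to do when a record continues onto the next page.
 *
 * page      - bytes of the next page that are available so far
 * avail     - number of valid bytes in 'page'
 * remaining - bytes of the record still missing, per its total length
 *
 * The point is the ordering: only a header's worth of data is required
 * before the overwrite flag is examined, so a reader never waits for
 * 'remaining' bytes that were lost in the crash.
 */
SketchReadResult
sketch_check_next_page(const char *page, size_t avail, size_t remaining)
{
    SketchPageHeader hdr;

    /* Step 1: wait only for the page header, not for the whole record. */
    if (avail < sizeof(SketchPageHeader))
        return SKETCH_NEED_MORE_WAL;

    memcpy(&hdr, page, sizeof(hdr));

    /* Step 2: check the overwrite flag before asking for more data. */
    if (hdr.xlp_info & XLP_FIRST_IS_OVERWRITE_CONTRECORD)
        return SKETCH_RECORD_OVERWRITTEN;

    /* Step 3: only now insist on the rest of the record being present. */
    if (avail < sizeof(SketchPageHeader) + remaining)
        return SKETCH_NEED_MORE_WAL;

    return SKETCH_CONTINUE_RECORD;
}
```

With a page whose header carries XLP_FIRST_IS_OVERWRITE_CONTRECORD, the
sketch returns SKETCH_RECORD_OVERWRITTEN as soon as the header bytes are
available, regardless of how large 'remaining' is. That is the behavior we
want instead of waiting indefinitely.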

>
> So you are suggesting the addition of an extra ReadPageInternal() that
> forces a read of only the header, perform the checks on the header, then
> read the rest. After reading SizeOfXLogShortPHD worth of data,
> shouldn't the checks on xlp_rem_len be done a bit earlier than what
> you are proposing in this patch?

I did not get the point. IMHO it has to be validated after the record
on the next page has been read.

--
Regards,
Dilip Kumar
Google
