Re: XLogBeginRead's header comment lies

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogBeginRead's header comment lies
Date: 2022-08-17 05:48:55
Message-ID: CAFiTN-sT_e6=emap8r=KyFa_fbQ4GkFLDYQrbtEfyus_AXMorQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 16, 2022 at 11:28 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> It claims that:
>
> * 'RecPtr' should point to the beginning of a valid WAL record. Pointing at
> * the beginning of a page is also OK, if there is a new record right after
> * the page header, i.e. not a continuation.
>
> But this actually doesn't seem to work. This function doesn't itself
> have any problem with any LSNs you want to pass it, so if you call
> this function with an LSN that is at the beginning of a page, you'll
> end up with EndRecPtr set to the LSN you specify and DecodeRecPtr set
> to NULL. When you then call XLogReadRecord, you'll reach
> XLogDecodeNextRecord, which will do this:
>
> if (state->DecodeRecPtr != InvalidXLogRecPtr)
> {
> /* read the record after the one we just read */
>
> /*
> * NextRecPtr is pointing to end+1 of the previous WAL record. If
> * we're at a page boundary, no more records can fit on the current
> * page. We must skip over the page header, but we can't do that until
> * we've read in the page, since the header size is variable.
> */
> }
> else
> {
> /*
> * Caller supplied a position to start at.
> *
> * In this case, NextRecPtr should already be pointing to a valid
> * record starting position.
> */
> Assert(XRecOffIsValid(RecPtr));
> randAccess = true;
> }
>
> Since DecodeRecPtr is NULL, you take the else branch, and then you
> fail an assertion.
>
> I tried adding a --beginread argument to pg_waldump (patch attached)
> to further verify this:
>
> [rhaas pgsql]$ pg_waldump -n1
> /Users/rhaas/pgstandby/pg_wal/0000000200000005000000A0
> rmgr: Heap len (rec/tot): 72/ 72, tx: 5778572, lsn:
> 5/A0000028, prev 5/9FFFFFB8, desc: HOT_UPDATE off 39 xmax 5778572
> flags 0x20 ; new off 62 xmax 0, blkref #0: rel 1663/16388/16402 blk 1
> [rhaas pgsql]$ pg_waldump -n1 --beginread
> /Users/rhaas/pgstandby/pg_wal/0000000200000005000000A0
> Assertion failed: (((RecPtr) % 8192 >= (((uintptr_t)
> ((sizeof(XLogPageHeaderData))) + ((8) - 1)) & ~((uintptr_t) ((8) -
> 1))))), function XLogDecodeNextRecord, file xlogreader.c, line 582.
> Abort trap: 6 (core dumped)
>
> The WAL record begins at offset 0x28 in the block, which I believe is
> the length of a long page header, so this is indeed a WAL segment that
> begins with a brand new record, not a continuation record.
>
> There are two ways we could fix this, I believe. One is to correct the
> comment at the start of XLogBeginRead() to reflect the way things
> actually work at present. The other is to correct the code to do what
> the header comment claims. I would prefer the latter, because I'd like
> to be able to use the EndRecPtr of the last record read by one
> xlogreader as the starting point for a new xlogreader created at a
> later time. I've found that, when there's no record spanning the block
> boundary, the EndRecPtr points to the start of the next block, not the
> start of the first record in the next block. I could dodge the problem
> here by just always using XLogFindNextRecord() rather than
> XLogBeginRecord(), but I'd actually like it to go boom if I somehow
> end up trying to start from an LSN that's in the middle of a record
> somewhere (or the middle of the page header) because those cases
> shouldn't happen. But if I just have an LSN that happens to be the
> start of the block header rather than the start of the record that
> follows the block header, I'd like that case to be tolerated, because
> the LSN I'm using came from the xlogreader machinery.
>
> Thoughts?

Yeah I think it makes sense to make it work as per the comment in
XLogBeginRecord(). I think if we modify the Assert as per the comment
of XLogBeginRecord() then the remaining code of the
XLogDecodeNextRecord() is capable enough to take care of skipping the
page header if we are pointing at the beginning of the block.

See attached patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
0001-Adjust-XLogDecodeNextRecord-assert-to-match-with-XLo.patch text/x-patch 1.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-08-17 05:56:25 Re: Materialized view rewrite is broken when there is an event trigger
Previous Message Bharath Rupireddy 2022-08-17 05:47:24 Re: add checkpoint stats of snapshot and mapping files of pg_logical dir