Re: WIP: WAL prefetch (another approach)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: WAL prefetch (another approach)
Date: 2021-04-29 02:25:53
Lists: pgsql-hackers


On 2021-04-28 17:59:22 -0700, Andres Freund wrote:
> I can however say that pg_waldump on the standby's pg_wal does also
> fail. The failure as part of the backend is "invalid memory alloc
> request size", whereas in pg_waldump I get the much more helpful:
> pg_waldump: fatal: error in WAL record at 4/7F5C31C8: record with incorrect prev-link 416200FF/FF000000 at 4/7F5C3200

There's definitely something broken around continuation records in
XLogFindNextRecord(). Which means that it's not the cause of the
server-side issue, but it's obviously still not good.

The conversion of XLogFindNextRecord() to be state machine based
basically only works in a narrow set of circumstances. Whenever the end
of the first record read is on a different page than the start of the
record, we'll endlessly loop.

We'll go into XLogFindNextRecord(), and keep returning until we've
successfully read the page header. Then we'll enter the second loop,
which will try to read until the end of the first record. But after
returning, the first loop will again ask for the page header.

Even if that's fixed, the second loop alone has the same problem: as
XLogBeginRead() is called unconditionally, we'll start reading at the
start of the record, discover that it needs data on a second page,
return, and do the same thing again.
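
To make the re-entry problem concrete, here is a toy model in C. It is
only a sketch: it is neither the PostgreSQL code nor the attached patch,
and every name in it (ToyReader, toy_begin_read, toy_step,
toy_calls_until_done) is hypothetical. It assumes a reader that consumes
at most one page per call and returns "not done yet" when it needs more
data, and it compares restarting the read on every entry with keeping
progress across entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: not PostgreSQL code. All names here are hypothetical. */
typedef struct ToyReader
{
    bool started;     /* has reading of the current record begun? */
    int  pages_read;  /* pages of the current record consumed so far */
} ToyReader;

/* Begin reading a record from its start, discarding any earlier progress. */
static void
toy_begin_read(ToyReader *r)
{
    r->started = true;
    r->pages_read = 0;
}

/*
 * One entry into the state machine. Consumes at most one page and returns
 * true once a record spanning record_pages pages is complete. Returning
 * false models "need more data, call me again".
 *
 * With restart_each_call set, the read is restarted unconditionally on
 * every entry, so progress made by the previous call is thrown away.
 */
static bool
toy_step(ToyReader *r, int record_pages, bool restart_each_call)
{
    assert(record_pages > 0);

    if (restart_each_call || !r->started)
        toy_begin_read(r);

    r->pages_read++;
    return r->pages_read >= record_pages;
}

/*
 * Drive the state machine until the record is complete. Returns the number
 * of calls that were needed, or -1 if max_calls entries were not enough.
 */
static int
toy_calls_until_done(int record_pages, bool restart_each_call, int max_calls)
{
    ToyReader r = {false, 0};

    for (int calls = 1; calls <= max_calls; calls++)
    {
        if (toy_step(&r, record_pages, restart_each_call))
            return calls;
    }
    return -1;
}
```

In this model a record that fits on one page completes in a single call
either way, which matches the "narrow set of circumstances" above. A
record spanning two pages never completes when every entry restarts the
read, and completes in two calls once the restart happens only on the
first entry.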

I think it needs something roughly like the attached.


Andres Freund

Attachment: fix-xlogfindnext.diff (text/x-diff, 2.0 KB)
