Re: Improve WALRead() to suck data directly from WAL buffers when possible

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date: 2023-12-07 10:29:00
Message-ID: CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 13, 2023 at 7:02 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Nov 10, 2023 at 2:28 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > I think the code needs to make sure that *never* happens. That seems unrelated
> > to holding or not holding WALBufMappingLock. Even if the page header is
> > already valid, I don't think it's ok to just read/parse WAL data that's
> > concurrently being modified.
> >
> > We can never allow WAL being read that's past
> > XLogBytePosToRecPtr(XLogCtl->Insert->CurrBytePos)
> > as it does not exist.
>
> Agreed. Erroring out in XLogReadFromBuffers() if passed in WAL is past
> the CurrBytePos is an option. Another cleaner way is to just let the
> caller decide what it needs to do (retry or error out) - fill an error
> message in XLogReadFromBuffers() and return as-if nothing was read or
> return a special negative error code like XLogDecodeNextRecord so that
> the caller can deal with it.

In the attached v17 patch, I've ensured that the XLogReadFromBuffers
returns when the caller requests a WAL that's past the current insert
position at the moment.

> Also, reading CurrBytePos with insertpos_lck spinlock can come in the
> way of concurrent inserters. A possible way is to turn both
> CurrBytePos and PrevBytePos 64-bit atomics so that
> XLogReadFromBuffers() can read CurrBytePos without any lock atomically
> and leave it to the caller to deal with non-existing WAL reads.
>
> > And if the to-be-read LSN is between
> > XLogCtl->LogwrtResult->Write and XLogBytePosToRecPtr(Insert->CurrBytePos)
> > we need to call WaitXLogInsertionsToFinish() before copying the data.
>
> Agree to wait for all in-flight insertions to the pages we're about to
> read to finish. But, reading XLogCtl->LogwrtRqst.Write requires either
> XLogCtl->info_lck spinlock or WALWriteLock. Maybe turn
> XLogCtl->LogwrtRqst.Write a 64-bit atomic and read it without any
> lock, rely on
> WaitXLogInsertionsToFinish()'s return value i.e. if
> WaitXLogInsertionsToFinish() returns a value >= Insert->CurrBytePos,
> then go read that page from WAL buffers.

In the attached v17 patch, I've ensured that the XLogReadFromBuffers
waits for all in-progress insertions to finish when the caller
requests WAL that's past the current write position and before the
current insert position.

I've also ensured that the XLogReadFromBuffers returns special return
codes for various scenarios (when asked to read in recovery, read on a
different TLI, read a non-existent WAL and so on.) instead of it
erroring out. This gives flexibility to the caller to decide what to
do.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v17-0001-Use-64-bit-atomics-for-xlblocks-array-elements.patch application/octet-stream 7.2 KB
v17-0002-Allow-WAL-reading-from-WAL-buffers.patch application/octet-stream 13.1 KB
v17-0003-Add-test-module-for-verifying-WAL-read-from-WAL-.patch application/octet-stream 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2023-12-07 10:36:38 Re: remaining sql/json patches
Previous Message Peter Eisentraut 2023-12-07 10:26:10 Re: Reducing output size of nodeToString