Re: Improve WALRead() to suck data directly from WAL buffers when possible

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date: 2024-02-14 01:29:47
Message-ID: 381ee9523ac78994ca6a8c2b3795cd303c99cebc.camel@j-davis.com
Lists: pgsql-hackers

Attached 2 patches.

Per Andres's suggestion, 0001 adds an assertion:
    Assert(startptr + count <= LogwrtResult.Write)

Though if we want to allow the caller (e.g. in an extension) to
determine the valid range, perhaps using WaitXLogInsertionsToFinish(),
then the check is wrong. Maybe we should just get rid of that code
entirely and trust the caller to request a reasonable range?

On Mon, 2024-02-12 at 17:33 -0800, Jeff Davis wrote:
> That makes me wonder whether my previous idea[1] might matter: when
> some buffers have been evicted, should WALReadFromBuffers() keep going
> through the loop and return the end portion of the requested data
> rather than the beginning?
>
> [1] https://www.postgresql.org/message-id/2b36bf99e762e65db0dafbf8d338756cf5fa6ece.camel@j-davis.com

0002 is to illustrate the above idea. It's a strange API so I don't
intend to commit it in this form, but I think we will ultimately need
to do something like it when we want to replicate unflushed data.

The idea is that data past the Write pointer is always (and only)
available in the WAL buffers, so WALReadFromBuffers() should always
return it. That way we can always safely fall through to ordinary
WALRead(), which can only see data before the Write pointer. There's also
data before the Write pointer that could be in the WAL buffers, and we
might as well copy that, too, if it's not evicted.

If some WAL buffers have been evicted, 0002 fills in the *end* of the caller's buffer,
leaving a gap at the beginning. The nice thing is that if there is any
gap, it will be before the Write pointer, so we can always fall back to
WALRead() to fill the gap and it should always succeed.
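
To make the argument concrete, here is a toy model of the idea, not the patch itself. Everything in it is hypothetical: ToyWal, toy_read_from_buffers(), toy_wal_read() and toy_read() are illustration-only names, the WAL is a flat string, and the "WAL buffers" are modeled as a single contiguous range starting at oldest_buffered. The one assumption taken from the reasoning above is that nothing past the Write pointer has been evicted, i.e. oldest_buffered <= write_ptr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t ToyLsn;        /* byte offset into the toy log */

/* Hypothetical stand-in for WAL state; not a PostgreSQL structure. */
typedef struct ToyWal
{
    const char *log;            /* all WAL bytes, written or not */
    ToyLsn      write_ptr;      /* bytes before this are readable "from disk" */
    ToyLsn      oldest_buffered;/* bytes at or after this are still buffered;
                                 * assumed to be <= write_ptr */
} ToyWal;

/*
 * Copy whatever part of [start, start + count) is still buffered into the
 * *end* of dst. Returns the number of bytes copied; any gap is at the
 * beginning of dst.
 */
static size_t
toy_read_from_buffers(const ToyWal *wal, char *dst, ToyLsn start, size_t count)
{
    ToyLsn      end = start + count;
    ToyLsn      first = start > wal->oldest_buffered ? start : wal->oldest_buffered;
    size_t      ncopied;

    if (first >= end)
        return 0;               /* whole range was evicted */

    ncopied = (size_t) (end - first);
    memcpy(dst + (first - start), wal->log + first, ncopied);
    return ncopied;
}

/* Stand-in for an ordinary read: it can only see data before write_ptr. */
static bool
toy_wal_read(const ToyWal *wal, char *dst, ToyLsn start, size_t count)
{
    if (start + count > wal->write_ptr)
        return false;
    memcpy(dst, wal->log + start, count);
    return true;
}

/* Buffers first, then fill the leading gap with the ordinary read. */
static bool
toy_read(const ToyWal *wal, char *dst, ToyLsn start, size_t count)
{
    size_t      tail = toy_read_from_buffers(wal, dst, start, count);
    size_t      gap = count - tail;

    /* The gap ends at or before oldest_buffered, hence before write_ptr. */
    if (gap > 0)
        return toy_wal_read(wal, dst, start, gap);
    return true;
}
```

With write_ptr = 16 and oldest_buffered = 8, a request for 16 bytes starting at offset 4 gets its last 12 bytes from the buffers (including 4 bytes past the Write pointer) and its first 4 bytes from the fallback read. If the buffers filled the beginning instead, the leftover portion could extend past the Write pointer, where the fallback cannot reach.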

Regards,
Jeff Davis

Attachment                                                      Content-Type   Size
0001-Add-assert-to-WALReadFromBuffers.patch                     text/x-patch   2.2 KB
0002-WALReadFromBuffers-read-end-of-the-requested-range.patch   text/x-patch   7.6 KB
