Re: refactoring relation extension and BufferAlloc(), faster COPY

From: Andres Freund <andres(at)anarazel(dot)de>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: refactoring relation extension and BufferAlloc(), faster COPY
Date: 2023-02-21 19:22:26
Message-ID: 20230221192226.ugbpx5pkg3d7xv7u@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-02-21 18:18:02 +0200, Heikki Linnakangas wrote:
> > v2-0007-bufmgr-Move-relation-extension-handling-into-Bulk.patch
>
> > +static BlockNumber
> > +BulkExtendSharedRelationBuffered(Relation rel,
> > + SMgrRelation smgr,
> > + bool skip_extension_lock,
> > + char relpersistence,
> > + ForkNumber fork, ReadBufferMode mode,
> > + BufferAccessStrategy strategy,
> > + uint32 *num_pages,
> > + uint32 num_locked_pages,
> > + Buffer *buffers)
>
> Ugh, that's a lot of arguments, some are inputs and some are outputs. I
> don't have any concrete suggestions, but could we simplify this somehow?
> Needs a comment at least.

Yea. I think this is the part of the patchset I like the least.

The ugliest bit is accepting both rel and smgr. The background to that is that
we need the relation oid to acquire the extension lock. But during crash
recovery we don't have that - which is fine, because we don't need the
extension lock.

We could have two different of functions, but that ends up a mess as well, as
we've seen in other cases.

> > v2-0008-Convert-a-few-places-to-ExtendRelationBuffered.patch
>
> > diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
> > index de1427a1e0e..1810f7ebfef 100644
> > --- a/src/backend/access/brin/brin.c
> > +++ b/src/backend/access/brin/brin.c
> > @@ -829,9 +829,11 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
> > * whole relation will be rolled back.
> > */
> > - meta = ReadBuffer(index, P_NEW);
> > + meta = ExtendRelationBuffered(index, NULL, true,
> > + index->rd_rel->relpersistence,
> > + MAIN_FORKNUM, RBM_ZERO_AND_LOCK,
> > + NULL);
> > Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
> > - LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
> > brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
> > BRIN_CURRENT_VERSION);
>
> Since we're changing the API anyway, how about introducing a new function
> for this case where we extend the relation but we what block number we're
> going to get? This pattern of using P_NEW and asserting the result has
> always felt awkward to me.

To me it always felt like a code smell that some code insists on specific
getting specific block numbers with P_NEW. I guess it's ok for things like
building a new index, but outside of that it feels wrong.

The first case I found just now is revmap_physical_extend(). Which seems to
extend the relation while holding an lwlock. Ugh.

Maybe ExtendRelationBufferedTo() or something like that? With a big comment
saying that users of it are likely bad ;)

> > - buf = ReadBuffer(irel, P_NEW);
> > + buf = ExtendRelationBuffered(irel, NULL, false,
> > + irel->rd_rel->relpersistence,
> > + MAIN_FORKNUM, RBM_ZERO_AND_LOCK,
> > + NULL);
>
> These new calls are pretty verbose, compared to ReadBuffer(rel, P_NEW). I'd
> suggest something like:

I guess. Not sure if it's worth optimizing for brevity all that much here -
there's not that many places extending relations. Several places end up with
less code, actually , because they don't need to care about the extension lock
themselves anymore. I think an ExtendBuffer() that doesn't mention the fork,
etc, ends up being more confusing than helpful.

> buf = ExtendBuffer(rel);

Without the relation in the name it just seems confusing to me - the extension
isn't "restricted" to shared_buffers. ReadBuffer() isn't great as a name
either, but it makes a bit more sense at least, it reads into a buffer. And
it's a vastly more frequent operation, so optimizing for density is worth it.

> Do other ReadBufferModes than RBM_ZERO_AND_LOCK make sense with
> ExtendRelationBuffered?

Hm. That's a a good point. Probably not. Perhaps it could be useful to support
RBM_NORMAL as well? But even if, it'd just be a lock release away if we always
used RBM_ZERO_AND_LOCK.

> Is it ever possible to call this without a relcache entry? WAL redo
> functions do that with ReadBuffer, but they only extend a relation
> implicitly, by replay a record for a particular block.

I think we should use it for crash recovery as well, but the patch doesn't
yet. We have some gnarly code there, see the loop using P_NEW in
XLogReadBufferExtended(). Extending the file one-by-one is a lot more
expensive than doing it in bulk.

> All of the above comments are around the BulkExtendRelationBuffered()
> function's API. That needs a closer look and a more thought-out design to
> make it nice. Aside from that, this approach seems valid.

Thanks for looking! I agree that it can stand a fair bit of polishing...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-02-21 20:35:55 Re: [PATCH] Fix unbounded authentication exchanges during PQconnectPoll()
Previous Message Tom Lane 2023-02-21 19:13:08 Re: Commitfest Manager