Re: refactoring relation extension and BufferAlloc(), faster COPY

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: refactoring relation extension and BufferAlloc(), faster COPY
Date: 2023-03-01 07:33:18
Message-ID: 20230301073318.56klnz6t6ynlw2cm@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-02-21 17:33:31 +0100, Alvaro Herrera wrote:
> On 2023-Feb-21, Heikki Linnakangas wrote:
>
> > > +static BlockNumber
> > > +BulkExtendSharedRelationBuffered(Relation rel,
> > > + SMgrRelation smgr,
> > > + bool skip_extension_lock,
> > > + char relpersistence,
> > > + ForkNumber fork, ReadBufferMode mode,
> > > + BufferAccessStrategy strategy,
> > > + uint32 *num_pages,
> > > + uint32 num_locked_pages,
> > > + Buffer *buffers)
> >
> > Ugh, that's a lot of arguments, some are inputs and some are outputs. I
> > don't have any concrete suggestions, but could we simplify this somehow?
> > Needs a comment at least.
>
> Yeah, I noticed this too. I think it would be easy enough to add a new
> struct that can be passed as a pointer, which can be stack-allocated
> by the caller, and which holds the input arguments that are common to
> both functions, as is sensible.

I played a fair bit with various options. I ended up not using a struct to
pass most options, but instead go for a flags argument. However, I did use a
struct for passing either relation or smgr.

typedef enum ExtendBufferedFlags {
EB_SKIP_EXTENSION_LOCK = (1 << 0),
EB_IN_RECOVERY = (1 << 1),
EB_CREATE_FORK_IF_NEEDED = (1 << 2),
EB_LOCK_FIRST = (1 << 3),

/* internal flags follow */
EB_RELEASE_PINS = (1 << 4),
} ExtendBufferedFlags;

/*
* To identify the relation - either relation or smgr + relpersistence has to
* be specified. Used via the EB_REL()/EB_SMGR() macros below. This allows us
* to use the same function for both crash recovery and normal operation.
*/
typedef struct ExtendBufferedWhat
{
Relation rel;
struct SMgrRelationData *smgr;
char relpersistence;
} ExtendBufferedWhat;

#define EB_REL(p_rel) ((ExtendBufferedWhat){.rel = p_rel})
/* requires use of EB_SKIP_EXTENSION_LOCK */
#define EB_SMGR(p_smgr, p_relpersistence) ((ExtendBufferedWhat){.smgr = p_smgr, .relpersistence = p_relpersistence})

extern Buffer ExtendBufferedRel(ExtendBufferedWhat eb,
ForkNumber forkNum,
BufferAccessStrategy strategy,
uint32 flags);
extern BlockNumber ExtendBufferedRelBy(ExtendBufferedWhat eb,
ForkNumber fork,
BufferAccessStrategy strategy,
uint32 flags,
uint32 extend_by,
Buffer *buffers,
uint32 *extended_by);
extern Buffer ExtendBufferedRelTo(ExtendBufferedWhat eb,
ForkNumber fork,
BufferAccessStrategy strategy,
uint32 flags,
BlockNumber extend_to,
ReadBufferMode mode);

As you can see I removed ReadBufferMode from most of the functions (as
suggested by Heikki earlier). When extending by 1/multiple pages, we only need
to know whether to lock or not.

The reason ExtendBufferedRelTo() has a 'mode' argument is that that allows to
fall back to reading page normally if there was a concurrent relation
extension.

The reason EB_CREATE_FORK_IF_NEEDED exists is to remove the duplicated,
gnarly, code to do so from vm_extend(), fsm_extend().

I'm not sure about the function naming pattern. I do like 'By' a lot more than
the Bulk prefix I used before.

What do you think?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Harinath Kanchu 2023-03-01 08:06:45 Re: LOG: invalid record length at <LSN> : wanted 24, got 0
Previous Message Bharath Rupireddy 2023-03-01 07:30:00 Combine pg_walinspect till_end_of_wal functions with others