Re: New Table Access Methods for Multi and Single Inserts

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Paul Guo <guopa(at)vmware(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: New Table Access Methods for Multi and Single Inserts
Date: 2023-08-01 16:30:00
Message-ID: CALj2ACX5UMWVFdrRNUE0KDrg54WV1cumBXwcETXhrPc1ibKAQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 4, 2023 at 4:08 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> This patch was referenced in a discussion at pgcon, so I thought I'd give it a
> look, even though Bharat said that he won't have time to drive it forward...

Thanks. Finally, I started to spend time on this. Just curious - may
I know the discussion in/for which this patch is referenced? What was
the motive? Is it captured somewhere?

> On 2021-04-19 10:21:36 +0530, Bharath Rupireddy wrote:
> > + .tuple_insert_begin = heap_insert_begin,
> > + .tuple_insert_v2 = heap_insert_v2,
> > + .multi_insert_v2 = heap_multi_insert_v2,
> > + .multi_insert_flush = heap_multi_insert_flush,
> > + .tuple_insert_end = heap_insert_end,
>
> I don't think we should have multiple callback for the insertion APIs in
> tableam.h. I think it'd be good to continue supporting the old table_*()
> functions, but supporting multiple insert APIs in each AM doesn't make much
> sense to me.

I named these new functions XXX_v2 for compatibility reasons. Because,
it's quite possible for external modules to use existing
table_tuple_insert, table_multi_insert functions. If we were to change
the existing insert tableams, all the external modules using them
would have to change their code, is that okay?

> > +/*
> > + * GetTupleSize - Compute the tuple size given a table slot.
> > +inline Size
>
> I think this embeds too much knowledge of the set of slot types in core
> code. I don't see why it's needed either?

The heapam multi-insert implementation needs to know the tuple size
from the slot to decide whether or not to flush the tuples from the
buffers. I couldn't find a direct way then to know the tuple size from
the slot, so added that helper function. With a better understanding
now, I think we can rely on the memory allocated for TupleTableSlot's
tts_mcxt. While this works for the materialized slots passed in to the
insert functions, for non-materialized slots the flushing decision can
be solely on the number of tuples stored in the buffers. Another way
is to add a get_tuple_size callback to TupleTableSlotOps and let the
tuple slot providers give us the tuple size.

> > diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
> > index 414b6b4d57..2a1470a7b6 100644
> > --- a/src/include/access/tableam.h
> > +++ b/src/include/access/tableam.h
> > @@ -229,6 +229,32 @@ typedef struct TM_IndexDeleteOp
> > TM_IndexStatus *status;
> > } TM_IndexDeleteOp;
> >
> > +/* Holds table insert state. */
> > +typedef struct TableInsertState
>
> I suspect we should design it to be usable for updates and deletes in the
> future, and thus name it TableModifyState.

There are different parameters that insert/update/delete would want to
pass across in the state. So, having Table{Insert/Update/Delete}State
may be a better idea than having the unneeded variables lying around
or having a union and state_type as INSERT/UPDATE/DELETE, no? Do you
have a different thought here?

> I think we should instead have a generic TableModifyState, which each AM then
> embeds into an AM specific AM state. Forcing two very related structs to be
> allocated separately doesn't seem wise in this case.

The v7 patches have largely changed the way these options and
parameters are passed, please have a look.

> > +{
> > + Relation rel;
> > + /* Bulk insert state if requested, otherwise NULL. */
> > + struct BulkInsertStateData *bistate;
> > + CommandId cid;
>
> Hm - I'm not sure it's a good idea to force the cid to be the same for all
> inserts done via one TableInsertState.

If required, someone can always pass a new CID before every
tuple_insert_v2/tuple_multi_insert_v2 call via TableInsertState. Isn't
it sufficient?

> > @@ -1430,6 +1473,50 @@ table_multi_insert(Relation rel, TupleTableSlot **slots, int nslots,
> > cid, options, bistate);
> > }
> >
> > +static inline TableInsertState*
> > +table_insert_begin(Relation rel, CommandId cid, int options,
> > + bool alloc_bistate, bool is_multi)
>
> Why have alloc_bistate and options?

"alloc_bistate" is for the caller to specify if they need a bulk
insert state or not. "options" is for the caller to specify if they
need table_tuple_insert performance options such as
TABLE_INSERT_SKIP_FSM, TABLE_INSERT_FROZEN, TABLE_INSERT_NO_LOGICAL.
The v7 patches have changed the way these options and parameters are
passed, please have a look.

> > +static inline void
> > +table_insert_end(TableInsertState *state)
> > +{
> > + /* Deallocate bulk insert state here, since it's AM independent. */
> > + if (state->bistate)
> > + FreeBulkInsertState(state->bistate);
> > +
> > + state->rel->rd_tableam->tuple_insert_end(state);
> > +}
>
> Seems like the order in here should be swapped?

Right. It looks like BulkInsertState is for heapam, it really doesn't
have to be in table_XXX functions, hence it all the way down to
heap_insert_XXX functions.

I'm attaching the v7 patch set with the above review comments
addressed. My initial idea behind these new insert APIs was the
ability to re-use the multi insert code in COPY for CTAS and REFRESH
MATERIALIZED VIEW. I'm open to more thoughts here.

The v7 patches have largely changed the way state structure (heapam
specific things are moved all the way down to heapam.c) is defined,
the parameters are passed, and simplified the multi insert logic a
lot.

0001 - introduces new single and multi insert table AM and heapam
implementation of the new AM.
0002 - optimizes CREATE TABLE AS to use the new multi inserts table AM
making it faster by 2.13X or 53%.
0003 - optimizes REFRESH MATERIALIZED VIEW to use the new multi
inserts table AM making it faster by 1.52X or 34%.
0004 - uses the new multi inserts table AM for COPY FROM - I'm yet to
spend time on this, I'll share the patch when ready.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v7-0001-New-table-AMs-for-single-and-multi-inserts.patch application/octet-stream 13.7 KB
v7-0002-Optimize-CTAS-with-multi-inserts.patch application/octet-stream 2.8 KB
v7-0003-Optimize-RMV-with-multi-inserts.patch application/octet-stream 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-08-01 16:37:07 Re: should frontend tools use syncfs() ?
Previous Message Konstantin Knizhnik 2023-08-01 16:00:10 One more problem with JIT