Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

From:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject:	Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Date:	2024-04-29 06:06:20
Message-ID:	CALj2ACWTrx1zxWvq8Uj2rEwCsDgQHeJ53WdvzZUw3kW+_VPG6A@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Apr 25, 2024 at 10:11 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Wed, 2024-04-24 at 18:19 +0530, Bharath Rupireddy wrote:
> > I added a flush callback named TableModifyBufferFlushCallback; when
> > provided by callers invoked after tuples are flushed to disk from the
> > buffers but before the AM frees them up. Index insertions and AFTER
> > ROW INSERT triggers can be executed in this callback. See the v19-
> > 0001 patch for how AM invokes the flush callback, and see either v19-
> > 0003 or v19-0004 or v19-0005 for how a caller can supply the callback
> > and required context to execute index insertions and AR triggers.
>
> The flush callback takes a pointer to an array of slot pointers, and I
> don't think that's the right API. I think the callback should be called
> on each slot individually.
>
> We shouldn't assume that a table AM stores buffered inserts as an array
> of slot pointers. A TupleTableSlot has a fair amount of memory overhead
> (64 bytes), so most AMs wouldn't want to pay that overhead for every
> tuple. COPY does, but that's because the number of buffered tuples is
> fairly small.

I get your point. An AM can choose to buffer plain tuples rather than
tuple slots, in which case a flush callback that takes an array of
tuple slots won't work. Therefore, I have changed the flush callback to
accept a single tuple slot.
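
To illustrate the per-slot callback shape, here is a minimal standalone
sketch. It is not code from the patch set: every name in it (DemoSlot,
DemoFlushCallback, DemoInsertState, demo_insert, demo_flush) is
hypothetical, and DemoSlot is only a stand-in for TupleTableSlot. It
assumes an AM that buffers plain tuples and materializes one reusable
slot at flush time, calling the callback once per flushed tuple.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tuple slot; not PostgreSQL's TupleTableSlot. */
typedef struct DemoSlot
{
    int value;
} DemoSlot;

/* Per-slot flush callback: called once for each flushed tuple. */
typedef void (*DemoFlushCallback) (void *context, DemoSlot *slot);

#define DEMO_BUFFER_CAPACITY 4

/* The AM buffers plain tuples (ints here), not an array of slots. */
typedef struct DemoInsertState
{
    int               buffered[DEMO_BUFFER_CAPACITY];
    size_t            nbuffered;
    DemoFlushCallback flush_cb;   /* optional, may be NULL */
    void             *flush_ctx;  /* caller-supplied context */
} DemoInsertState;

/* Flush all buffered tuples, invoking the callback on each one. */
static void
demo_flush(DemoInsertState *state)
{
    DemoSlot scratch;             /* one slot reused for every tuple */

    assert(state != NULL);
    for (size_t i = 0; i < state->nbuffered; i++)
    {
        /* A real AM would write the tuple to storage here. */
        if (state->flush_cb != NULL)
        {
            scratch.value = state->buffered[i];
            state->flush_cb(state->flush_ctx, &scratch);
        }
    }
    state->nbuffered = 0;         /* the buffer can now be reused */
}

/* Buffer one tuple, flushing first if the buffer is already full. */
static void
demo_insert(DemoInsertState *state, int value)
{
    assert(state != NULL);
    if (state->nbuffered == DEMO_BUFFER_CAPACITY)
        demo_flush(state);
    state->buffered[state->nbuffered++] = value;
}

/* Example caller context, standing in for index-insertion state. */
typedef struct DemoIndexContext
{
    size_t ncalls;
    long   sum;
} DemoIndexContext;

/* Example callback: a caller would do index insertions or AR triggers here. */
static void
demo_index_insert_cb(void *context, DemoSlot *slot)
{
    DemoIndexContext *ctx = (DemoIndexContext *) context;

    ctx->ncalls++;
    ctx->sum += slot->value;
}
```

In this sketch the caller never sees how the AM stores buffered tuples,
so the AM is free to avoid the per-tuple slot overhead.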

> > > 11. Deprecate the multi_insert API.
> >
> > I did remove both table_multi_insert and table_finish_bulk_insert in
> > v19-0006.
>
> That's OK with me. Let's leave those functions out for now.

Okay. Dropped the 0006 patch for now.

Please see the attached v20 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment	Content-Type	Size
v20-0001-Introduce-new-Table-Access-Methods-for-single-an.patch	application/x-patch	20.5 KB
v20-0002-Optimize-CTAS-CMV-RMV-and-TABLE-REWRITES-with-mu.patch	application/x-patch	7.0 KB
v20-0003-Optimize-INSERT-INTO-.-SELECT-with-multi-inserts.patch	application/x-patch	9.2 KB
v20-0004-Optimize-Logical-Replication-apply-with-multi-in.patch	application/x-patch	19.7 KB
v20-0005-Use-new-multi-insert-Table-AM-for-COPY-FROM.patch	application/x-patch	14.4 KB

In response to

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-04-25 16:41:08 from Jeff Davis

Responses

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-05-15 07:26:17 from Bharath Rupireddy