Re: New Table Access Methods for Multi and Single Inserts

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: New Table Access Methods for Multi and Single Inserts
Date: 2024-03-19 05:09:36
Message-ID: CAD21AoD97mhzF8cqsd2v1jg9z8xfvAJrPx6Wvi+Ev0Hmu96LJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Mar 8, 2024 at 7:37 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Sat, Mar 2, 2024 at 12:02 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Jan 29, 2024 at 5:16 PM Bharath Rupireddy
> > <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > >
> > > > Please find the attached v9 patch set.
> >
> > I've had to rebase the patches due to commit 874d817, please find the
> > attached v11 patch set.
>
> Rebase needed. Please see the v12 patch set.
>

I've not reviewed the patches in depth yet, but run performance tests
for CREATE MATERIALIZED VIEW. The test scenarios is:

-- setup
create unlogged table test (c int);
insert into test select generate_series(1, 10000000);

-- run
create materialized view test_mv as select * from test;

Here are the results:

* HEAD
3775.221 ms
3744.039 ms
3723.228 ms

* v12 patch
6289.972 ms
5880.674 ms
7663.509 ms

I can see performance regressions and the perf report says that CPU
spent most time on extending the ResourceOwner's array while copying
the buffer-heap tuple:

- 52.26% 0.18% postgres postgres [.] intorel_receive
52.08% intorel_receive
table_multi_insert_v2 (inlined)
- heap_multi_insert_v2
- 51.53% ExecCopySlot (inlined)
tts_buffer_heap_copyslot
tts_buffer_heap_store_tuple (inlined)
- IncrBufferRefCount
- ResourceOwnerEnlarge
ResourceOwnerAddToHash (inlined)

Is there any reason why we copy a buffer-heap tuple to another
buffer-heap tuple? Which results in that we increments the buffer
refcount and register it to ResourceOwner for every tuples. I guess
that the destination tuple slot is not necessarily a buffer-heap, and
we could use VirtualTupleTableSlot instead. It would in turn require
copying a heap tuple. I might be missing something but it improved the
performance at least in my env. The change I made was:

- dstslot = table_slot_create(state->rel, NULL);
+ //dstslot = table_slot_create(state->rel, NULL);
+ dstslot = MakeTupleTableSlot(RelationGetDescr(state->rel),
+ &TTSOpsVirtual);
+

And the execution times are:
1588.984 ms
1591.618 ms
1582.519 ms

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrei Lepikhov 2024-03-19 05:16:59 Re: POC, WIP: OR-clause support for indexes
Previous Message Jeff Davis 2024-03-19 04:58:57 Re: Be strict when request to flush past end of WAL in WaitXLogInsertionsToFinish