non-bulk inserts and tuple routing

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: non-bulk inserts and tuple routing
Date: 2017-12-19 10:06:16
Message-ID: 8975331d-d961-cbdd-f862-fdd3d97dc2d0@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

I have a patch that rearranges the code around partition tuple-routing,
such that allocation of per-partition objects (ResultRelInfo,
TupleConversionMap, etc.) is delayed until a given partition is actually
inserted into (i.e., a tuple is routed to it). I can see good win for
non-bulk inserts with the patch and the patch is implemented such that it
doesn't affect the bulk-insert case much.

Performance numbers:

* Uses following hash-partitioned table:

create table t1 (a int, b int) partition by hash (a);
create table t1_x partition of t1 for values with (modulus M, remainder R)
...

* Non-bulk insert uses the following code (insert 100,000 rows one-by-one):

do $$
begin
for i in 1..100000 loop
insert into t1 values (i, i+1);
end loop;
end; $$;

* Times in milliseconds:

#parts HEAD Patched

8 6216.300 4977.670
16 9061.388 6360.093
32 14081.656 8752.405
64 24887.110 13919.384
128 45926.251 24582.411
256 88088.084 45490.894

As you can see the performance can be as much as 2x faster with the patch,
although time taken still increases as the number of partitions increases,
because we still lock *all* partitions at the beginning.

* Bulk-inserting 100,000 rows using COPY:

copy t1 from '/tmp/t1.csv' csv;

* Times in milliseconds:

#parts HEAD Patched

8 458.301 450.875
16 409.271 510.723
32 500.960 612.003
64 430.687 795.046
128 449.314 565.786
256 493.171 490.187

Not much harm here, although numbers are a bit noisy.

Patch is divided into 4, first 3 of which are refactoring patches.

I know this patch will conflict severely with [1] and [2], so it's fine if
we consider applying these later. Will add this to next CF.

Thanks,
Amit

[1] https://commitfest.postgresql.org/16/1023/

[2] https://commitfest.postgresql.org/16/1184/

Attachment Content-Type Size
0001-Teach-CopyFrom-to-use-ModifyTableState-for-tuple-rou.patch text/plain 8.1 KB
0002-ExecFindPartition-refactoring.patch text/plain 3.9 KB
0003-ExecSetupPartitionTupleRouting-refactoring.patch text/plain 9.1 KB
0004-During-tuple-routing-initialize-per-partition-object.patch text/plain 16.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-12-19 10:12:10 Re: non-bulk inserts and tuple routing
Previous Message Amit Kapila 2017-12-19 10:01:05 Re: [HACKERS] parallel.c oblivion of worker-startup failures