Re: non-bulk inserts and tuple routing

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: non-bulk inserts and tuple routing
Date: 2018-01-19 08:56:03
Message-ID: f3a0d60e-1f60-f2ea-dd6e-89c397e4ecf4@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/12/19 19:06, Amit Langote wrote:
> Hi.
>
> I have a patch that rearranges the code around partition tuple-routing,
> such that allocation of per-partition objects (ResultRelInfo,
> TupleConversionMap, etc.) is delayed until a given partition is actually
> inserted into (i.e., a tuple is routed to it). I can see good win for
> non-bulk inserts with the patch and the patch is implemented such that it
> doesn't affect the bulk-insert case much.
>
> Performance numbers:
>
> * Uses following hash-partitioned table:
>
> create table t1 (a int, b int) partition by hash (a);
> create table t1_x partition of t1 for values with (modulus M, remainder R)
> ...
>
>
> * Non-bulk insert uses the following code (insert 100,000 rows one-by-one):
>
> do $$
> begin
> for i in 1..100000 loop
> insert into t1 values (i, i+1);
> end loop;
> end; $$;
>
> * Times in milliseconds:
>
> #parts HEAD Patched
>
> 8 6216.300 4977.670
> 16 9061.388 6360.093
> 32 14081.656 8752.405
> 64 24887.110 13919.384
> 128 45926.251 24582.411
> 256 88088.084 45490.894
>
> As you can see the performance can be as much as 2x faster with the patch,
> although time taken still increases as the number of partitions increases,
> because we still lock *all* partitions at the beginning.
>
> * Bulk-inserting 100,000 rows using COPY:
>
> copy t1 from '/tmp/t1.csv' csv;
>
> * Times in milliseconds:
>
> #parts HEAD Patched
>
> 8 458.301 450.875
> 16 409.271 510.723
> 32 500.960 612.003
> 64 430.687 795.046
> 128 449.314 565.786
> 256 493.171 490.187
>
> Not much harm here, although numbers are a bit noisy.
>
> Patch is divided into 4, first 3 of which are refactoring patches.
>
> I know this patch will conflict severely with [1] and [2], so it's fine if
> we consider applying these later. Will add this to next CF.

I rebased the patches, since they started conflicting with a recently
committed patch [1].

Thanks,
Amit

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cc6337d2fed5

Attachment Content-Type Size
v2-0001-Teach-CopyFrom-to-use-ModifyTableState-for-tuple-.patch text/plain 4.5 KB
v2-0002-ExecFindPartition-refactoring.patch text/plain 4.0 KB
v2-0003-ExecSetupPartitionTupleRouting-refactoring.patch text/plain 3.8 KB
v2-0004-During-tuple-routing-initialize-per-partition-obj.patch text/plain 18.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-01-19 09:24:56 Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Previous Message Masahiko Sawada 2018-01-19 08:43:15 Re: Typo in slotfuncs.c