Quick Links

Re: Speeding up INSERTs and UPDATEs to partitioned tables

From:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To:	Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Speeding up INSERTs and UPDATEs to partitioned tables
Date:	2018-07-15 08:34:32
Message-ID:	CAKJS1f_ETz_+oCAieQH4c1jtvFdsr4MBi5R20Uo+7dnTivaS8A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 13 July 2018 at 20:20, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Why don't we abandon the notion altogether that
> ExecSetupPartitionTupleRouting *has to* process the whole partition tree?

[...]

> I implemented that idea in the attached patch, which applies on top of
> your 0001 patch, but I'd say it's too big to be just called a delta. I
> was able to get following performance numbers using the following pgbench
> test:

Thanks for looking at this. I like that your idea gets rid of the
indexes cache in syscache. I was not very happy with that part.

I've looked over the code and the ExecUseUpdateResultRelForRouting()
function is broken. Your while loop only skips partitions for the
current partitioned table, it does not skip ModifyTable subnodes that
belong to other partitioned tables.

You can use the following. The code does not find the t1_a2 subnode.

create table t1 (a int, b int) partition by list(a);
create table t1_a1 partition of t1 for values in(1) partition by list(b);
create table t1_a2 partition of t1 for values in(2);
create table t1_a1_b1 partition of t1_a1 for values in(1);
create table t1_a1_b2 partition of t1_a1 for values in(2);
insert into t1 values(2,2);

update t1 set a = a;

I think there might not be enough information to make this work
correctly, as if you change the loop to skip subnodes, then it won't
work in cases where the partition[0] was pruned.

I've another patch sitting here, partly done, that changes
pg_class.relispartition into pg_class.relpartitionparent. If we had
that then we could code your loop to work correctly. Alternatively,
I guess we could just ignore the UPDATE's ResultRelInfos and just
build new ones. Unsure if there's actually a reason we need to reuse
the existing ones, is there?

I think you'd need to know the owning partition and skip subnodes that
don't belong to pd->reldesc. Alternatively, a hashtable could be built
with all the oids belonging to pd->reldesc, then we could loop over
the update_rris finding subnodes that can be found in the hashtable.
Likely this will be much slower than the sort of merge lookup that the
previous code did.

Another thing that I don't like is the PARTITION_ROUTING_MAXSIZE code.
The code seems to assume that there can only be at the most 65536
partitions, but I don't think there's any code which restricts us to
that. There is code in the planner that will bork when trying to
create a RangeTblEntry up that high, but as far as I know that won't
be noticed on the INSERT path. I don't think this code has any
business knowing what the special varnos are set to either. It would
be better to just remove the limit and suffer the small wasted array
space. I understand you've probably coded it like this due to the
similar code that was in my patch, but with mine I knew the total
number of partitions. Your patch does not.

Other thoughts on the patch:

I wonder if it's worth having syscache keep a count on the number of
sub-partitioned tables a partition has. If there are none in the root
partition then the partition_dispatch_info can be initialized with
just 1 element to store the root details. Although, maybe it's not
worth it to reduce the array size by 7 elements.

Also, I'm a bit confused why you change the comments in
execPartition.h for PartitionTupleRouting to be inline again. I
brought those out of line as I thought the complexity of the code
warranted that. You're inlining them again goes against what all the
other structs do in that file.

Apart from that, I think the idea is promising. We'll just need to
find a way to make ExecUseUpdateResultRelForRouting work correctly.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: Speeding up INSERTs and UPDATEs to partitioned tables at 2018-07-13 08:20:47 from Amit Langote

Responses

Re: Speeding up INSERTs and UPDATEs to partitioned tables at 2018-07-18 08:29:19 from Amit Langote

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Dean Rasheed	2018-07-15 09:36:11	Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message	Tomas Vondra	2018-07-14 23:38:12	Re: [HACKERS] plpgsql - additional extra checks