Re: Speeding up INSERTs and UPDATEs to partitioned tables

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speeding up INSERTs and UPDATEs to partitioned tables
Date: 2018-11-08 07:15:43
Message-ID: 38a63253-5027-e690-74f7-154d0b6ef5b0@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/11/07 20:46, David Rowley wrote:
> On 5 November 2018 at 20:17, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2018/11/04 19:07, David Rowley wrote:
>>> Perhaps a better design would be to instead of having random special
>>> partitioned-table-only fields in ResultRelInfo, just have an extra
>>> struct there that contains the extra partition information which would
>>> include the translation maps and then just return the ResultRelInfo
>>> and allow callers to lookup any extra details they require.
>>
>> IIUC, you're saying that we could introduce a new struct that contains
>> auxiliary information needed by partition pruning (maps, slot, etc. for
>> tuple conversion) and add a new ResultRelInfo member of that struct type.
>> That way, there is no need to return them separately or return an index to
>> access them from their arrays. I guess we won't even need the arrays we
>> have now. I think that might be a good idea and simplifies things
>> significantly.
>
> I've attached a patch which does this.

Thank you for updating the patch this way.

> It adds a new struct named
> PartitionRoutingInfo into ResultRelInfo and pulls 3 of the 4 arrays
> out of PartitionTupleRouting. There are fields for each of what these
> arrays used to store inside the PartitionRoutingInfo struct.
>
> While doing this I kept glancing back over at ModifyTableState and at
> the mt_per_subplan_tupconv_maps array. I wondered if it would be
> better to make the PartitionRoutingInfo a bit more generic, perhaps
> call it TupleConversionInfo and have fields like ti_ToGeneric and
> ti_FromGeneric, with the idea that "generic" would be the root
> partition or the first subplan for inheritance updates. This would
> allow us to get rid of a good chunk of code inside nodeModifyTable.c.
> However, I ended up not doing this and left PartitionRoutingInfo to be
> specifically for Root to Partition conversion.

I think it's good that you left mt_per_subplan_tupconv_maps out of this.
UPDATE tuple routing can be said to have two steps: first step, a tiny
one, converts the tuple that needs to be routed from the source
partition's rowtype to the root's rowtype so that tuple routing proper can
begin, and second is the actual tuple routing carried out using
PartitionTupleRouting. The first step is handled by nodeModifyTable.c and
so any data structures related to it should be in ModifyTableState.

Actually, as I also proposed upthread, we should move root_tuple_slot from
PartitionTupleRouting to ModifyTableState as mt_root_tuple_slot, because
it's part of the first step described above that has nothing to do with
partition tuple routing proper. We can make PartitionTupleRouting private
to execPartition.c if we do that.

> Also, on the topic about what to name the conversion maps from a few
> days ago; After looking at this a bit more I decided that having them
> named ParentToChild and ChildToParent is misleading. If the child is
> the child of some sub-partitioned table then the parent that the map
> is talking about is not the partition's parent, but the root
> partitioned table. So really RootToPartition and PartitionToRoot seem
> like much more accurate names for the maps.

Makes sense. :)

> I made a few other changes along the way; I thought that
> ExecFindPartition() would be a good candidate to take on the
> responsibility of validating the partition is valid for INSERTs when
> it uses a partition out of the subplan_resultrel_hash. I thought it
> was better to check this once when we're in the code path of grabbing
> the ResultRelInfo out that hash table rather than in a code path that
> must check if it's been done already each time we route a tuple into a
> partition. It also allowed me to get rid of
> ri_PartitionReadyForRouting.

Ah, I too had tried to remove ri_PartitionReadyForRouting, but had to give
up on that idea because I didn't think of moving steps that are needed to
be performed before setting it to true to that block of code in
ExecFindPartition.

> I also moved the call to
> ExecInitRoutingInfo() into ExecFindPartition() which allowed me to
> make that function static, which could result in the generation of
> slightly more optimal compiled code.
>
> Please find attached the v14 patch.
>
> Rather nicely git --stat reports a net negative additional code (with
> the 48 lines of new tests included)
>
> 11 files changed, 632 insertions(+), 660 deletions(-)

That's nice!

I didn't find anything quite significant to complain about, except just
one line:

+ * Initially we must only setup 1 PartitionDispatch object; the one for

setup -> set up

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2018-11-08 07:27:46 Re: file cloning in pg_upgrade and CREATE DATABASE
Previous Message John Naylor 2018-11-08 07:02:24 Re: PostgreSQL Limits and lack of documentation about them.