Re: Speeding up INSERTs and UPDATEs to partitioned tables

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speeding up INSERTs and UPDATEs to partitioned tables
Date: 2018-08-03 05:58:03
Message-ID: 765091db-4931-ed5e-d1ca-b80ae1b03b85@lab.ntt.co.jp
Lists: pgsql-hackers

(looking at the v5 patch but replying to an older email)

On 2018/07/31 16:03, David Rowley wrote:
> I've attached a complete v4 patch.
>
>> By the way, when going over the updated code, I noticed that the code
>> around child_parent_tupconv_maps could use some refactoring too.
>> Especially, I noticed that ExecSetupChildParentMapForLeaf() allocates
>> child-to-parent map array needed for transition tuple capture even if not
>> needed by any of the leaf partitions. I'm attaching here a patch that
>> applies on top of your v3 to show what I'm thinking we could do.
>
> Maybe we can do that as a follow-on patch.

We probably could, but I think it would be a good idea to get rid of *all*
redundant allocations due to tuple routing in one patch, if that's the
mission of this thread and the patch anyway.

> I think what we have so far
> is already ended up quite complex to review. What do you think?

Yeah, it's kind of complex, but at least we seem to be clear that the goal
here is to get rid of redundant allocations.

Parts of the patch that appear complex seem to be around the allocation
of various maps. Especially the child-to-parent maps, which as things
stand today, come from two arrays -- a per-update-subplan array that's
needed by update tuple routing proper and per-leaf partition array (one in
PartitionTupleRouting) that's needed by transition capture machinery. The
original coding was such that the update tuple routing handling code would try
to avoid allocating the per-update-subplan array if it saw that per-leaf
partition array was already set up in PartitionTupleRouting, because
transition capture is active in the query. For update-tuple-routing code
to be able to use maps from the per-leaf array, it would have to know
which update-subplans mapped to which tuple-routing-initialized
partitions. That was maintained in the subplan_partition_offset array
that's now gone with this patch, because we no longer want to fix the
tuple-routing-initialized-partition offsets in advance. So, it's better
to dissociate per-subplan maps which are initialized during
ExecInitModifyTable from per-leaf maps which are initialized lazily when
tuple routing initializes a partition, which is what my portion of the
patch did.
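
To illustrate the separation described above, here is a toy sketch in C.
It is not code from the patch: ToyConvMap, ToyRouting and the toy_*
functions are hypothetical stand-ins, and relations are represented by
plain integers. It only shows the shape of the idea, namely that
per-subplan maps are built eagerly up front, while the per-leaf array and
its entries are built only when a leaf is first routed to.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple conversion map (not PostgreSQL's type). */
typedef struct ToyConvMap
{
    int from_rel;   /* child (leaf) relation id */
    int to_rel;     /* parent (root) relation id */
} ToyConvMap;

/* Hypothetical stand-in for the per-leaf part of the routing state. */
typedef struct ToyRouting
{
    int          nleaves;
    ToyConvMap **child_parent_maps; /* NULL until some leaf needs a map */
    int          nbuilt;            /* number of per-leaf maps built so far */
} ToyRouting;

static ToyConvMap *
toy_build_map(int from_rel, int to_rel)
{
    ToyConvMap *map = malloc(sizeof(ToyConvMap));

    if (map == NULL)
        abort();
    map->from_rel = from_rel;
    map->to_rel = to_rel;
    return map;
}

/* Eager: one map per update subplan, all built at initialization time. */
static ToyConvMap **
toy_init_subplan_maps(const int *subplan_rels, int nplans, int root_rel)
{
    ToyConvMap **maps = calloc(nplans, sizeof(ToyConvMap *));

    if (maps == NULL)
        abort();
    for (int i = 0; i < nplans; i++)
        maps[i] = toy_build_map(subplan_rels[i], root_rel);
    return maps;
}

/* Lazy: allocate the per-leaf array and each entry only on first use. */
static ToyConvMap *
toy_leaf_child_parent_map(ToyRouting *routing, int leaf_index,
                          int leaf_rel, int root_rel)
{
    assert(leaf_index >= 0 && leaf_index < routing->nleaves);

    if (routing->child_parent_maps == NULL)
    {
        routing->child_parent_maps =
            calloc(routing->nleaves, sizeof(ToyConvMap *));
        if (routing->child_parent_maps == NULL)
            abort();
    }
    if (routing->child_parent_maps[leaf_index] == NULL)
    {
        routing->child_parent_maps[leaf_index] =
            toy_build_map(leaf_rel, root_rel);
        routing->nbuilt++;
    }
    return routing->child_parent_maps[leaf_index];
}
```

In this sketch the two kinds of maps never share storage, so nothing like
an offset array is needed to translate between subplan indexes and leaf
indexes.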

As mentioned in my last email, I still think it would be a good idea to
simplify the handling of child-to-parent maps in PartitionTupleRouting
even further, while we're improving the code in this area. I revised
the patch such that it makes the handling of maps in PartitionTupleRouting
even more uniform. With that patch, we no longer have two completely
unrelated places in the code managing parent-to-child and child-to-parent
maps, even though both arrays are in the same PartitionTupleRouting.
Please find the updated patch attached with this email.

Thanks,
Amit

Attachment: v2-0002-Refactor-handling-of-child_parent_tupconv_maps.patch (text/plain, 12.3 KB)
