Re: Speeding up INSERTs and UPDATEs to partitioned tables

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speeding up INSERTs and UPDATEs to partitioned tables
Date: 2018-11-04 10:07:09
Message-ID: CAKJS1f_65UGE1A6DA8vvyd1=-iNGwk67wWOwtPyw5OC=GOUyww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1 November 2018 at 22:39, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2018/11/01 10:30, David Rowley wrote:
>> It's great to know the patch is now so perfect that we've only the
>> macro naming left to debate ;-)
>
> I looked over v12 again and noticed a couple minor issues.
>
> + * table then we store the index into parenting
> + * PartitionTupleRouting 'partition_dispatch_info' array. An
>
> s/PartitionTupleRouting/PartitionTupleRouting's/g
>
> Also, I got a bit concerned about "parenting". Does it mean something
> like "enclosing", because the PartitionDispatch is a member of
> PartitionTupleRouting? I got concerned because using "parent" like this
> may be confusing as this is the partitioning code we're talking about,
> where "parent" is generally used to mean "parent" table.
>
> + * the partitioned table that's the target of the command. If we must
> + * route tuple via some sub-partitioned table, then the PartitionDispatch
> + * for those is only built the first time it's required.
>
> ... via some sub-partitioned table"s"
>
> Or perhaps rewrite a bit as:
>
> If we must route the tuple via some sub-partitioned table, then its
> PartitionDispatch is built the first time it's required.

I've attached v13 which hopefully addresses these.

> The macro naming discussion got me thinking today about the macro itself.
> It encapsulates access to the various PartitionTupleRouting arrays
> containing the maps, but maybe we've got the interface of tuple routing a
> bit (maybe a lot given this thread!) wrong to begin with. Instead of
> ExecFindPartition returning indexes into various PartitionTupleRouting
> arrays and its callers then using those indexes to fetch various objects
> from those arrays, why doesn't it return those objects itself? Although
> we made sure that the callers don't need to worry about the meaning of
> these indexes changing with this patch, it still seems a bit odd for them
> to have to go back to those arrays to get various objects.
>
> How about we change ExecFindPartition's interface so that it returns the
> ResultRelInfo, the two maps, and the partition slot? So, the arrays
> simply become a cache for ExecFindPartition et al and are no longer
> accessed outside execPartition.c. Although that makes the interface of
> ExecFindPartition longer, I think it reduces overall complexity.

I don't really think stuffing values into a bunch of output parameters
is much of an improvement. I'd rather allow callers to fetch what they
need using the index we return. Most callers don't need to know about
the child to parent maps, so it seems nicer for those places not to
have to invent a dummy variable to pass along to ExecFindPartition()
so it can needlessly populate it for them.

Perhaps a better design would be to instead of having random special
partitioned-table-only fields in ResultRelInfo, just have an extra
struct there that contains the extra partition information which would
include the translation maps and then just return the ResultRelInfo
and allow callers to lookup any extra details they require. I've not
looked into this in detail, but I think the committer work that's
required for the patch as it is today is already quite significant.
I'm not keen on warding any willing one off by making the commit job
any harder. I agree that it would be good to stabilise the API for
all this partitioning code sometime, but I don't believe it needs to
be done all in one commit. My intent here is to improve performance or
INSERT and UPDATE on partitioned tables. Perhaps we can shape some API
redesign later in the release cycle. What do you think?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
v13-0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch application/octet-stream 64.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2018-11-04 10:18:33 Re: pg_dump multi VALUES INSERT
Previous Message John Naylor 2018-11-04 08:26:34 Re: WIP: Avoid creation of the free space map for small tables