Re: ModifyTable overheads in generic plans

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Subject: Re: ModifyTable overheads in generic plans
Date: 2021-04-05 13:42:48
Message-ID: CA+HiwqFKAmzK6JykmX8kROzHVWF6L7nT9yCF_p2nxd2M6_QpAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 5, 2021 at 1:43 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> > On Sun, Apr 4, 2021 at 10:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> In some desultory performance testing here, it seemed like a
> >> significant part of the cost is ExecOpenIndices, and I don't see
> >> a reason offhand why we could not delay/skip that. I also concur
> >> with delaying construction of ri_ChildToRootMap and the
> >> partition_tuple_routing data structures, since many queries will
> >> never need those at all.
>
> > As I mentioned in [1], creating ri_projectNew can be expensive too,
> > especially as column count (and partition count for the generic plan
> > case) grows. I think we should have an static inline
> > initialize-on-first-access accessor function for that field too.
>
> > Actually, I remember considering having such accessor functions (all
> > static inline) for ri_WithCheckOptionExprs, ri_projectReturning,
> > ri_onConflictArbiterIndexes, and ri_onConflict (needed by ON CONFLICT
> > UPDATE) as well, prompted by Heikki's comments earlier in the
> > discussion. I also remember, before even writing this patch, not
> > liking that WCO and RETURNING expressions are initialized in their own
> > separate loops, rather than being part of the earlier loop that says:
>
> Sure, we might as well try to improve the cosmetics here.
>
> >> Anyway, I think the way to proceed for now is to grab the low-hanging
> >> fruit of things that clearly won't change any semantics. But tail end
> >> of the dev cycle is no time to be making really fundamental changes
> >> in how FDW direct modify works.
>
> > I agree.
>
> OK. Do you want to pull out the bits of the patch that we can still
> do without postponing BeginDirectModify?

I ended up with the attached, whereby ExecInitResultRelation() is now
performed for all relations before calling ExecInitNode() on the
subplan. As mentioned, I moved other per-result-rel initializations
into the same loop that does ExecInitResultRelation(), while moving
code related to some initializations into initialize-on-first-access
accessor functions for the concerned fields. I chose to do that for
ri_WIthCheckOptionExprs, ri_projectReturning, and ri_projectNew.

ExecInitNode() is called on the subplan (to set
outerPlanState(mtstate) that is) after all of the per-result-rel
initializations are done. One of the initializations is calling
BeginForeignModify() for non-direct modifications, an API to which we
currently pass mtstate. Moving that to before setting
outerPlanState(mtstate) so as to be in the same loop as other
initializations had me worried just a little bit given a modification
I had to perform in postgresBeginForeignModify():

@@ -1879,7 +1879,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
rte,
resultRelInfo,
mtstate->operation,
- outerPlanState(mtstate)->plan,
+ outerPlan(mtstate->ps.plan),
query,
target_attrs,
values_end_len,

Though I think that this is harmless, because I'd think that the
implementers of this API shouldn't really rely too strongly on
assuming that outerPlanState(mtstate) is valid when it is called, if
postgres_fdw's implementation is any indication.

Another slightly ugly bit is the dependence of direct modify API on
ri_projectReturning being set even if it doesn't care for anything
else in the ResultRelInfo. So in ExecInitModifyTable()
ri_projectReturning initialization is not skipped for
directly-modified foreign result relations.

Notes on regression test changes:

* Initializing WCO quals during execution instead of during
ExecInitNode() of ModifyTable() causes a couple of regression test
changes in updatable_view.out that were a bit unexpected for me --
Subplans that are referenced in WCO quals are no longer shown in the
plain EXPLAIN output. Even though that's a user-visible change, maybe
we can live with that?

* ri_RootResultRelInfo in *all* child relations instead of just in
tuple-routing result relations has caused changes to inherit.out and
privileges.out. I think that's basically down to ExecConstraints() et
al doing one thing for child relations in which ri_RootResultRelInfo
is set and another for those in which it is not. Now it's set in
*all* child relations, so it always does the former thing. I remember
having checked that those changes are only cosmetic when I first
encountered them.

* Moving PartitionTupleRouting initialization to be done lazily for
cross-partition update cases causes changes to update.out. They have
to do with the fact that the violations of the actual target table's
partition constraint are now shown as such, instead of reporting them
as occurring on one of the leaf partitions. Again, only cosmetic.

> Another thing we could consider, perhaps, is keeping the behavior
> the same for foreign tables but postponing init of local ones.
> To avoid opening the relations to figure out which kind they are,
> we'd have to rely on the RTE copies of relkind, which is a bit
> worrisome --- I'm not certain that those are guaranteed to be
> up-to-date --- but it's probably okay since there is no way to
> convert a regular table to foreign or vice versa. Anyway, that
> idea seems fairly messy so I'm inclined to just pursue the
> lower-hanging fruit for now.

It would be nice to try that idea out, but I tend to agree with the last part.

Also, I'm fairly happy with the kind of performance improvement I see
even with the lower-hanging fruit patch for my earlier earlier shared
benchmark that tests the performance of generic plan execution:

HEAD (especially with 86dc90056 now in):

nparts 10cols 20cols 40cols

64 6926 6394 6253
128 3758 3501 3482
256 1938 1822 1776
1024 406 371 406

Patched:

64 13147 12554 14787
128 7850 9788 9631
256 4472 5599 5638
1024 1218 1503 1309

I also tried with a version where the new tuple projections are built
in ExecInitModifyTable() as opposed to lazily:

64 10937 9969 8535
128 6586 5903 4887
256 3613 3118 2654
1024 884 749 652

This tells us that delaying initializing new tuple projection for
updates can have a sizable speedup and better scalability.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v16-0001-Initialize-result-relation-information-lazily.patch application/octet-stream 60.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-04-05 13:49:25 Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation
Previous Message Euler Taveira 2021-04-05 13:41:22 Re: use AV worker items infrastructure for GIN pending list's cleanup