Re: making update/delete of inheritance trees scale better

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making update/delete of inheritance trees scale better
Date: 2020-09-11 10:20:56
Message-ID: CA+HiwqFMw9G8SQMzjnPge2X3tdZF0uvkFTjrV--7_yQb6eEDSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I have been working away at this and have updated the patches for many
cosmetic and some functional improvements.

On Fri, Jun 12, 2020 at 3:46 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> I divided that into two patches:
>
> 1. Make the plan producing tuples to be updated emit only the columns
> that are actually updated. postgres_fdw test fails unless you also
> apply the patch I posted at [1], because there is an unrelated bug in
> UPDATE tuple routing code that manifests due to some changes of this
> patch.
>
> 2. Due to 1, inheritance_planner() is no longer needed, that is,
> inherited update/delete can be handled by pulling the rows to
> update/delete from only one plan, not one per child result relation.
> This one makes that so.
>
> There are some unsolved problems having to do with foreign tables in
> both 1 and 2:
>
> In 1, FDW update APIs still assume that the plan produces "full" tuple
> for update. That needs to be fixed so that FDWs deal with getting
> only the updated columns in the plan's output targetlist.
>
> In 2, still haven't figured out a way to call PlanDirectModify() on
> child foreign tables. Lacking that, inherited updates on foreign
> tables are now slower, because they are not pushed down. I'd like to
> figure something out to fix that situation.

In the updated patch, I have implemented a partial solution to this,
but I think it should be enough in most practically useful situations.
With the updated patch, PlanDirectModify is now called for child
result relations, but the FDWs will need to be revised to do useful
work in that call (as the patch does for postgres_fdw), because a
potentially pushable ForeignScan involving a given child result
relation will now be at the bottom of the source plan tree, whereas
before it would be the top-level plan. Another disadvantage of this
new situation is that inherited update/delete involving joins that
were previously pushable cannot be pushed anymore. If update/delete
would have been able to use partition-wise join, a child join
involving a given child result relation could in principle be pushed,
but some semi-related issues prevent the use of partition-wise joins
for update/delete, especially when there are foreign table partitions.

Another major change is that instead of "tableoid" junk attribute to
identify the target result relation for a given tuple to be
updated/deleted, the patch now makes the tuples to be updated/deleted
contain a junk attribute that gives the index of the result relation
in the query's list of result relations which can be used to look up
the target result relation directly. With "tableoid", we would need
to build a hash table to map the result relation OIDs to result
relation indexes, a step that could be seen to become a bottleneck
with large partition counts (I am talking about executing generic
plans here and have mentioned this problem on the thread to make
generic plan execution for update/delete faster [1]).

Here are the commit messages of the attached patches:

[PATCH v3 1/3] Overhaul how updates compute a new tuple

Currently, the planner rewrites the top-level targetlist of an update
statement's parsetree so that it contains entries for all attributes
of the target relation, including for those columns that have not
been changed. This arrangement means that the executor can take a
tuple that the plan produces, remove any junk attributes in it and
pass it down to the table AM or FDW update API as the new tuple.
It also means that in an inherited update, where there are multiple
target relations, the planner must produce that many plans, because
the targetlists for different target relations may not all look the
same considering that child relations may have different sets of
columns with varying attribute numbers.

This commit revises things so that the planner no longer expands
the parsetree targetlist to include unchanged columns so that the
plan only produces values of the changed columns. To make the new
tuple to pass to table AM and FDW update API, executor now evaluates
another targetlist matching the target table's TupleDesc which refers
to the plan's output tuple to gets values of the changed columns and
to the old tuple that is refetched for values of unchanged columns.

To get values for unchanged columns to use when forming the new tuple
to pass to ExecForeignUpdate(), we now require foreign scans to
always include the wholerow Var corresponding to the old tuple being
updated, because the unchanged columns are not present in the
plan's targetlist.

As a note to FDW authors, any FDW update planning APIs that look at
the plan's targetlist for checking if it is pushable to remote side
(e.g. PlanDirectModify) should now instead look at "update targetlist"
that is set by the planner in PlannerInfo.update_tlist, because resnos
in the plan's targetlist is no longer indexable by target column's
attribute numbers.

Note that even though the main goal of doing this is to avoid having
to make multiple plans in the inherited update case, this commit does
not touch that subject. A subsequent commit will change things that
are necessary to make inherited updates work with a single plan.

[PATCH v3 2/3] Include result relation index if any in ForeignScan

FDWs that can perform an UPDATE/DELETE remotely using the "direct
modify" set of APIs need in some cases to access the result relation
properties for which they can currently look at
EState.es_result_relation_info. However that means the executor must
ensure that es_result_relation_info points to the correct result
relation at all times, especially during inherited updates. This
requirement gets in the way of number of projects related to changing
how ModifyTable operates. For example, an upcoming patch will change
things such that there will be one source plan for all result
relations whereas currently there is one per result relation, an
arrangement which makes it convenient to switch the result relation
when the source plan changes.

This commit installs a new field 'resultRelIndex' in ForeignScan node
which must be set by an FDW if the node will be used to carry out an
UPDATE/DELETE operation on a given foreign table, which is the case
if the FDW manages to push that operations to the remote side. This
commit also modifies postgres_fdw to implement that.

[PATCH v3 3/3] Revise how inherited update/delete are handled

Now that we have the ability to maintain and evaluate the targetlist
needed to generate an update's new tuples independently of the plan
which fetches the tuples to be updated, there is no need to make
separate plans for child result relations as inheritance_planner()
currently does. We generated separate plans before such capability
was present, because that was the only way to generate new tuples of
child relations where each may have its own unique set of columns
(albeit all sharing the set columns present in the root parent).

With this commit, an inherited update/delete query will now be planned
just as a non-inherited one, generating a single plan that goes under
ModifyTable. The plan for the inherited case is essentially the one
that we get for a select query, although the targetlist additionally
contains junk attributes needed by update/delete.

By going from one plan per result relation to only one shared across
all result relations, the executor now needs a new way to identify the
result relation to direct a given tuple's update/delete to, whereas
before, it could tell that from the plan it is executing. To that
end, the planner now adds a new junk attribute to the query's
targetlist that for each tuple gives the index of the result relation
in the query's list of result relations. That is in addition to the
junk attribute that the planner already adds to identify the tuple's
position in a given relation (such as "ctid").

Given the way query planning with inherited tables work where child
relations are not part of the query's jointree and only the root
parent is, there are some challenges that arise in the update/delete
case:

* The junk attributes needed by child result relations need to be
represented as root parent Vars, which is a non-issue for a given
child if what the child needs and what is added for the root parent
are one and the same column. But considering that that may not
always be the case, more parent Vars might get added to the top-level
targetlist as children are added to the query as result relations.
In some cases, a child relation may use a column that is not present
in the parent (allowed by traditional inheritance) or a non-column
expression, which must be represented using what this patch calls
"fake" parent vars. These fake parent vars are really only
placeholders for the underlying child relation's column or expression
and don't reach the executor's expression evluation machinery.

* FDWs that are able to push update/delete fully to the remote side
using DirectModify set of APIs now have to go through hoops to
identify the subplan and the UPDATE targetlist to push for child
result relations, because the subplans for individual result
relations are no loger top-level plans. In fact, if the result
relation is joined to another relation, update/delete cannot be
pushed down at all anymore, whereas before since the child relations
would be present in the main jointree, they could be in the case
where the relation being joined to was present on the same server as
the child result relation.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA%40mail.gmail.com

Attachment Content-Type Size
v3-0002-Include-result-relation-index-if-any-in-ForeignSc.patch application/octet-stream 13.4 KB
v3-0001-Overhaul-how-updates-compute-a-new-tuple.patch application/octet-stream 75.0 KB
v3-0003-Revise-how-inherited-update-delete-are-handled.patch application/octet-stream 202.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2020-09-11 10:42:17 Re: On login trigger: take three
Previous Message Tomas Vondra 2020-09-11 10:14:41 Re: WIP: BRIN multi-range indexes