Re: making update/delete of inheritance trees scale better

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making update/delete of inheritance trees scale better
Date: 2020-10-04 02:44:03
Message-ID: CA+HiwqHUkwcy84uFfUA3qVsyU2pgTwxVkJx1uwPQFSHfPz4rsA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 11, 2020 at 7:20 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Here are the commit messages of the attached patches:
>
> [PATCH v3 1/3] Overhaul how updates compute a new tuple
>
> Currently, the planner rewrites the top-level targetlist of an update
> statement's parsetree so that it contains entries for all attributes
> of the target relation, including for those columns that have not
> been changed. This arrangement means that the executor can take a
> tuple that the plan produces, remove any junk attributes in it and
> pass it down to the table AM or FDW update API as the new tuple.
> It also means that in an inherited update, where there are multiple
> target relations, the planner must produce that many plans, because
> the targetlists for different target relations may not all look the
> same considering that child relations may have different sets of
> columns with varying attribute numbers.
>
> This commit revises things so that the planner no longer expands
> the parsetree targetlist to include unchanged columns so that the
> plan only produces values of the changed columns. To make the new
> tuple to pass to the table AM and FDW update APIs, the executor now
> evaluates another targetlist matching the target table's TupleDesc,
> which refers to the plan's output tuple for values of the changed
> columns and to the old tuple, refetched separately, for values of
> the unchanged columns.
>
> To get values for unchanged columns to use when forming the new tuple
> to pass to ExecForeignUpdate(), we now require foreign scans to
> always include the wholerow Var corresponding to the old tuple being
> updated, because the unchanged columns are not present in the
> plan's targetlist.
>
> As a note to FDW authors, any FDW update planning APIs that look at
> the plan's targetlist for checking if it is pushable to remote side
> (e.g. PlanDirectModify) should now instead look at "update targetlist"
> that is set by the planner in PlannerInfo.update_tlist, because resnos
> in the plan's targetlist are no longer indexable by the target
> columns' attribute numbers.
>
> Note that even though the main goal of doing this is to avoid having
> to make multiple plans in the inherited update case, this commit does
> not touch that subject. A subsequent commit will change things that
> are necessary to make inherited updates work with a single plan.

I tried to assess the performance impact of this rejiggering of how
updates are performed. As to why one might expect a negative impact,
consider that ExecModifyTable() now has to perform an extra fetch of
the tuple being updated in order to fill in the unchanged values of
the update's NEW tuple, because the plan itself only produces the
values of the changed columns.
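To illustrate the mechanism with a hypothetical sketch (in Python, not
the actual executor code; all names here are illustrative only): the
full NEW tuple is effectively formed by overlaying the plan's
changed-column values onto the refetched old tuple.

```python
# Hypothetical sketch of how the patched executor forms the NEW tuple:
# the plan emits only the changed columns; unchanged ones come from
# the old tuple refetched by ctid. Not PostgreSQL internals.

def form_new_tuple(old_tuple, changed_values):
    """old_tuple: column -> value for the refetched row;
    changed_values: covers only the columns the UPDATE set."""
    new_tuple = dict(old_tuple)        # start from the unchanged values
    new_tuple.update(changed_values)   # overlay the changed columns
    return new_tuple

# e.g. for "update ... set b = 42" on a 4-column row:
old = {"a": 42, "b": None, "c": 0, "d": "ddd"}
new = form_new_tuple(old, {"b": 42})
```

The extra cost being measured below is the refetch that produces
`old_tuple`, which HEAD avoids by carrying all columns through the plan.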

* Setup: a 10-column target table with a million rows

create table test_update_10 (
a int,
b int default NULL,
c int default 0,
d text default 'ddd',
e text default 'eee',
f text default 'fff',
g text default 'ggg',
h text default 'hhh',
i text default 'iii',
j text default 'jjj'
);
insert into test_update_10 (a) select generate_series(1, 1000000);
create index on test_update_10 (a);

* pgbench test script (test_update_10.sql):

\set a random(1, 1000000)
update test_update_10 set b = :a where a = :a;

* TPS of `pgbench -n -T 120 -f test_update_10.sql`

HEAD:

tps = 10964.391120 (excluding connections establishing)
tps = 12142.456638 (excluding connections establishing)
tps = 11746.345270 (excluding connections establishing)
tps = 11959.602001 (excluding connections establishing)
tps = 12267.249378 (excluding connections establishing)

median: 11959.60

Patched:

tps = 11565.916170 (excluding connections establishing)
tps = 11952.491663 (excluding connections establishing)
tps = 11959.789308 (excluding connections establishing)
tps = 11699.611281 (excluding connections establishing)
tps = 11799.220930 (excluding connections establishing)

median: 11799.22
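(For reference, the medians reported throughout are simply the middle
value of the five runs; computed here in Python for the numbers above:)

```python
# Median of the five TPS runs above (10-column simple-update case).
from statistics import median

head = [10964.391120, 12142.456638, 11746.345270,
        11959.602001, 12267.249378]
patched = [11565.916170, 11952.491663, 11959.789308,
           11699.611281, 11799.220930]

print(round(median(head), 2))     # 11959.6
print(round(median(patched), 2))  # 11799.22
```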

There is a slight impact, but the difference seems within the margin
of error.

On the more optimistic side, I imagined that the trimming down of the
plan's targetlist to include only changed columns would boost
performance, especially with tables containing more columns, which is
not uncommon. With 20 columns (additional columns are all filler ones
as shown in the 10-column example), the same benchmark gives the
following numbers:

HEAD:

tps = 11401.691219 (excluding connections establishing)
tps = 11620.855088 (excluding connections establishing)
tps = 11285.469430 (excluding connections establishing)
tps = 10991.890904 (excluding connections establishing)
tps = 10847.433093 (excluding connections establishing)

median: 11285.46

Patched:

tps = 10958.443325 (excluding connections establishing)
tps = 11613.783817 (excluding connections establishing)
tps = 10940.129336 (excluding connections establishing)
tps = 10717.405272 (excluding connections establishing)
tps = 11691.330537 (excluding connections establishing)

median: 10958.44

Hmm, not so much.

With 40 columns:

HEAD:

tps = 9778.362149 (excluding connections establishing)
tps = 10004.792176 (excluding connections establishing)
tps = 9473.849373 (excluding connections establishing)
tps = 9776.931393 (excluding connections establishing)
tps = 9737.891870 (excluding connections establishing)

median: 9776.93

Patched:

tps = 10709.949043 (excluding connections establishing)
tps = 10754.160718 (excluding connections establishing)
tps = 10175.841480 (excluding connections establishing)
tps = 9973.729774 (excluding connections establishing)
tps = 10467.109679 (excluding connections establishing)

median: 10467.10

There you go.

Perhaps the plan's bigger target list with HEAD does not cause
significant overhead in a *simple* update like the one above, because
most of the work during execution is in fetching the tuple to update
and in actually updating it. So, I also checked with a slightly more
complicated query containing a join:

\set a random(1, 1000000)
update test_update_10 t set b = foo.b from foo where t.a = foo.a and foo.b = :a;

where `foo` is defined as:

create table foo (a int, b int);
insert into foo select i, i from generate_series(1, 1000000) i;
create index on foo (b);

Looking at the EXPLAIN output of the query, one can see that the
target list is smaller after patching, which saves some work:

HEAD:

explain (costs off, verbose) update test_update_10 t set b = foo.b
from foo where t.a = foo.a and foo.b = 1;
QUERY PLAN
--------------------------------------------------------------------------------------
Update on public.test_update_10 t
-> Nested Loop
Output: t.a, foo.b, t.c, t.d, t.e, t.f, t.g, t.h, t.i, t.j,
t.ctid, foo.ctid
-> Index Scan using foo_b_idx on public.foo
Output: foo.b, foo.ctid, foo.a
Index Cond: (foo.b = 1)
-> Index Scan using test_update_10_a_idx on public.test_update_10 t
Output: t.a, t.c, t.d, t.e, t.f, t.g, t.h, t.i, t.j, t.ctid
Index Cond: (t.a = foo.a)
(9 rows)

Patched:

explain (costs off, verbose) update test_update_10 t set b = foo.b
from foo where t.a = foo.a and foo.b = 1;
QUERY PLAN
------------------------------------------------------------------------------
Update on public.test_update_10 t
-> Nested Loop
Output: foo.b, t.ctid, foo.ctid
-> Index Scan using foo_b_idx on public.foo
Output: foo.b, foo.ctid, foo.a
Index Cond: (foo.b = 1)
-> Index Scan using test_update_10_a_idx on public.test_update_10 t
Output: t.ctid, t.a
Index Cond: (t.a = foo.a)
(9 rows)
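Counting the top-level targetlist entries in the two plans above makes
the difference concrete (a back-of-the-envelope sketch; the junk
columns here are the two ctid entries visible in the EXPLAIN output):

```python
# Top-level targetlist width for the 10-column join-update plans above.
n_cols = 10  # columns in test_update_10

# HEAD: every target-table column plus the two junk ctids
head_tlist = n_cols + 2

# Patched: only the changed column (foo.b) plus the two junk ctids
patched_tlist = 1 + 2

print(head_tlist, patched_tlist)  # 12 3
```

With HEAD the width grows linearly with the number of target columns,
which matches the widening gap in the 20- and 40-column runs below.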

And here are the TPS numbers for that query with the 10-, 20-, and
40-column tables. Note that the more columns the target table has,
the bigger the target list to compute with HEAD.

10 columns:

HEAD:

tps = 7594.881268 (excluding connections establishing)
tps = 7660.451217 (excluding connections establishing)
tps = 7598.899951 (excluding connections establishing)
tps = 7413.397046 (excluding connections establishing)
tps = 7484.978635 (excluding connections establishing)

median: 7594.88

Patched:

tps = 7402.409104 (excluding connections establishing)
tps = 7532.776214 (excluding connections establishing)
tps = 7549.397016 (excluding connections establishing)
tps = 7512.321466 (excluding connections establishing)
tps = 7448.255418 (excluding connections establishing)

median: 7512.32

20 columns:

HEAD:

tps = 6842.674366 (excluding connections establishing)
tps = 7151.724481 (excluding connections establishing)
tps = 7093.727976 (excluding connections establishing)
tps = 7072.273547 (excluding connections establishing)
tps = 7040.350004 (excluding connections establishing)

median: 7093.72

Patched:

tps = 7362.941398 (excluding connections establishing)
tps = 7106.826433 (excluding connections establishing)
tps = 7353.507317 (excluding connections establishing)
tps = 7361.944770 (excluding connections establishing)
tps = 7072.027684 (excluding connections establishing)

median: 7353.50

40 columns:

HEAD:

tps = 6396.845818 (excluding connections establishing)
tps = 6383.105593 (excluding connections establishing)
tps = 6370.143763 (excluding connections establishing)
tps = 6370.455213 (excluding connections establishing)
tps = 6380.993666 (excluding connections establishing)

median: 6380.99

Patched:

tps = 7091.581813 (excluding connections establishing)
tps = 7036.805326 (excluding connections establishing)
tps = 7019.120007 (excluding connections establishing)
tps = 7025.704379 (excluding connections establishing)
tps = 6848.846667 (excluding connections establishing)

median: 7025.70

It seems clear that the savings in target list computation overhead
that we get from the patch are hard to ignore in this case.

I've attached updated patches, because as Michael pointed out, the
previous version no longer applies.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v4-0001-Overhaul-how-updates-compute-a-new-tuple.patch application/octet-stream 75.3 KB
v4-0002-Include-result-relation-index-if-any-in-ForeignSc.patch application/octet-stream 13.4 KB
v4-0003-Revise-how-inherited-update-delete-are-handled.patch application/octet-stream 200.9 KB
