Re: making update/delete of inheritance trees scale better

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: making update/delete of inheritance trees scale better
Date: 2021-03-30 04:51:15
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's a v13 patchset that I feel pretty good about.

My original thought for replacing the "fake variable" design was to
add another RTE holding the extra variables, and then have setrefs.c
translate the placeholder variables to the real thing at the last
possible moment. I soon realized that instead of an actual RTE,
it'd be better to invent a special varno value akin to INDEX_VAR
(I called it ROWID_VAR, though I'm not wedded to that name). Info
about the associated variables is kept in a list of RowIdentityVarInfo
structs, which are more suitable than a regular RTE would be.

I got that and the translate-in-setrefs approach more or less working,
but it was fairly messy, because the need to know about these special
variables spilled into FDWs and a lot of other places; for example
indxpath.c needed a special check for them when deciding if an
index-only scan is possible. What turns out to be a lot cleaner is
to handle the translation in adjust_appendrel_attrs_mutator(), so that
we have converted to real variables by the time we reach any
relation-scan-level logic.

I did end up having to break the API for FDW AddForeignUpdateTargets
functions: they need to do things differently when adding junk columns,
and they need different parameters. This seems all to the good though,
because the old API has been a backwards-compatibility hack for some
time (e.g., in not passing the "root" pointer).

Some other random notes:

* I was unimpressed with the idea of distinguishing different target
relations by embedding integer constants in the plan. In the first
place, the implementation was extremely fragile --- there was
absolutely NOTHING tying the counter you used to the subplans' eventual
indexes in the ModifyTable lists. Plus I don't have a lot of faith
that setrefs.c will reliably do what you want in terms of bubbling the
things up. Maybe that could be made more robust, but the other problem
is that the EXPLAIN output is just about unreadable; nobody will
understand what "(0)" means. So I went back to the idea of emitting
tableoid, and installed a hashtable plus a one-entry lookup cache
to make the run-time mapping as fast as I could. I'm not necessarily
saying that this is how it has to be indefinitely, but I think we
need more work on planner and EXPLAIN infrastructure before we can
get the idea of directly providing a list index to work nicely.

* I didn't agree with your decision to remove the now-failing test
cases from postgres_fdw.sql. I think it's better to leave them there,
especially in the cases where we were checking the plan as well as
the execution. Hopefully we'll be able to un-break those soon.

* I updated a lot of hereby-obsoleted comments, which makes the patch
a bit longer than v12; but actually the code is a good bit smaller.
There's a noticeable net code savings in src/backend/optimizer/,
which there was not before.

I've not made any attempt to do performance testing on this,
but I think that's about the only thing standing between us
and committing this thing.

regards, tom lane

Attachment Content-Type Size
v13-0001-Overhaul-how-updates-compute-new-tuple.patch text/x-diff 85.7 KB
v13-0002-Revise-how-inherited-update-delete-are-handled.patch text/x-diff 227.2 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-03-30 05:09:17 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Ajin Cherian 2021-03-30 04:48:30 Re: [PATCH] add concurrent_abort callback for output plugin