Re: Deleting older versions in unique indexes to avoid page splits

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Victor Yegorov <vyegorov(at)gmail(dot)com>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2020-11-09 17:20:50
Message-ID: CAH2-WzmP5AymEfT_n3wAdvW8D7DduapHPqRzds5kv7VjnXsx6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 3, 2020 at 12:44 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> v6 still needs more polishing -- my focus has still been on the
> algorithm itself. But I think I'm almost done with that part -- it
> seems unlikely that I'll be able to make any additional significant
> improvements in that area after v6.

Attached is v7, which tidies everything up. The project is now broken
up into multiple patches, which can be committed separately. Every
patch has a descriptive commit message. This should make it a lot
easier to review.

I've renamed the feature to "bottom-up index deletion" in this latest
revision. This seems like a better name than "dedup deletion". This
name suggests that the feature complements "top-down index deletion"
by VACUUM. This name is descriptive of what the new mechanism is
supposed to do at a high level.

Other changes in v7 include:

* We now fully use the tableam interface -- see the first patch.

The bottom-up index deletion API has been fully worked out. There is
now an optional callback/shim function. The bottom-up index deletion
caller (nbtree) is decoupled from the callee (heapam) by the tableam
shim. This was what allowed me to break the patch into multiple
pieces/patches.

* The executor no longer uses a IndexUniqueCheck-enum-constant as a
hint to nbtree. Rather, we have a new aminsert() bool argument/flag
that hints to the index AM -- see the second patch.

To recap, the hint tells nbtree that the incoming tuple is a duplicate
of an existing tuple caused by an UPDATE, without any logical changes
for the indexed columns. Bottom-up deletion is effective when there is
a local concentration of these index tuples that become garbage
quickly.

A dedicated aminsert() argument seems a lot cleaner. Though I wonder
if this approach can be generalized a bit further, so that we can
support other similar aminsert() hints in the future without adding
even more arguments. Maybe some new enum instead of a boolean?

* Code cleanup for the nbtdedup.c changes. Better explanation of when
and how posting list TIDs are marked promising, and why.

* Streamlined handling of the various strategies that nbtinsert.c uses
to avoid a page split (e.g. traditional LP_DEAD deletion,
deduplication).

A new unified function in nbtinsert.c was added. This organization is
a lot cleaner -- it greatly simplifies _bt_findinsertloc(), which
became more complicated than it really needed to be due to the changes
needed for deduplication in PostgreSQL 13. This change almost seems
like an independently useful piece of work.

--
Peter Geoghegan

Attachment Content-Type Size
v7-0004-Teach-heapam-to-support-bottom-up-index-deletion.patch application/octet-stream 24.8 KB
v7-0002-Pass-down-logically-unchanged-index-hint.patch application/octet-stream 25.7 KB
v7-0003-Teach-nbtree-to-use-bottom-up-index-deletion.patch application/octet-stream 67.5 KB
v7-0001-Make-tableam-interface-support-bottom-up-deletion.patch application/octet-stream 7.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-11-09 17:36:20 Re: Disable WAL logging to speed up data loading
Previous Message David G. Johnston 2020-11-09 17:05:21 Re: Disable WAL logging to speed up data loading