Adding an aminsert() hint that triggers bottom-up index deletion for UPDATEs that can't use HOT

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Adding an aminsert() hint that triggers bottom-up index deletion for UPDATEs that can't use HOT
Date: 2020-11-10 01:10:23
Message-ID: CAH2-WzmL+p-F1JOb7b6gJRhV-qiO_=aNETG43+V+7EPKxSKP5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I am working on a project called "bottom-up index deletion" (I was
calling it "delete deduplication" until recently). This is another
nbtree project. The patch series adds a mechanism that controls old
duplicate versions caused by non-HOT UPDATEs. This is very effective.
In fact, it prevents almost all "unnecessary" page splits. By that I
mean it totally prevents page splits caused by version churn, where
old duplicate versions accumulate in indexes and cause page splits.
This is at least true in the absence of a long running xact/open
snapshot, though only provided the index isn't very small.

Importantly, the delete mechanism that the. patch series adds
naturally tends to bound the number of physical index tuple versions
for any given logical row represented in the index. I believe that
that's a big problem for us at present.

I'm starting a new thread to discuss issues with changing the
signature of the aminsert() index AM routine to add a hint for the
benefit of the patch. I would like to get some buy-in on the details.
But first some background:

The patch relies on the executor passing down a hint to nbtree that
indicates that the incoming index tuple is from an UPDATE that did not
modify any of the columns covered by the index. That's how the new
deletion mechanism avoids wasting cycles in cases that have no chance
of benefiting from the optimization, like plain INSERTs whose
aminsert() calls naturally insert non-transient index tuples that
point to whole new logical rows. We only trigger a relatively
expensive special round of deletion when we notice an accumulation of
versions on a leaf page, and are fairly confident (though not yet
completely sure) that that's what we see on the page. Hence
"bottom-up".

It's easy to review the executor mechanism in isolation, without
looking at the indexing stuff at all. It's broken out into its own
patch file in the latest version. The patch is called
v7-0002-Pass-down-logically-unchanged-index-hint.patch, and can be
downloaded here:

https://postgr.es/m/CAH2-WzmP5AymEfT_n3wAdvW8D7DduapHPqRzds5kv7VjnXsx6Q@mail.gmail.com

Some questions about the infrastructure I'm thinking of adding:

1. Is there any reason to believe that this will create noticeable
performance overhead elsewhere?

2. Is the current approach of adding a new boolean argument to
aminsert() comprehensive?

Examples of where I might have gone wrong with the current draft design:

Might I be missing an opportunity to add a more general mechanism that
will be useful in a variety of index access methods?

Maybe an enum would make more sense?

Or maybe I should add a new amupdate() routine? Or something else completely?

ISTM that ther general idea of giving index access methods hints about
what's going on with UPDATE chains is a good one. It isn't necessary
for index AMs to have a *reliable* understanding of exactly what an
UPDATE chain looks like -- that would be very brittle. But conveying
the general idea of the "lifecycle" of the data at the level of a leaf
page makes perfect sense.

To expand the discussion beyond the immediate needs of my patch: I
also think that it would make sense to "delete mark" index tuples with
a visibility hint (think of a new "probably going to be garbage before
too long" index tuple bit) when DELETE statements run. Again, this
doesn't have to be reliable in the same way that setting an LP_DEAD
bit does. This delete marking stuff is not on my agenda right now.
Just an example of another mechanism based on similar principles.

--
Peter Geoghegan

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-11-10 01:21:04 Re: pg_upgrade analyze script
Previous Message Kyotaro Horiguchi 2020-11-10 00:59:41 Re: Disable WAL logging to speed up data loading