Re: Deleting older versions in unique indexes to avoid page splits

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2020-11-17 21:38:05
Message-ID: CAH2-WzmFsqz_dbFCv1+jhdA8C_e3JcnnR=wh0=HD3imY-MbyXQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 17, 2020 at 7:17 AM Victor Yegorov <vyegorov(at)gmail(dot)com> wrote:
> On Thu, Nov 12, 2020 at 23:00, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> Another thing that I'll probably add to v8: Prefetching. This is
>> probably necessary just so I can have parity with the existing
>> heapam.c function that the new code is based on,
>> heap_compute_xid_horizon_for_tuples(). That will probably help here,
>> too.
>
> I don't quite see this part. Do you mean top_block_groups_favorable() here?

I meant to add prefetching to the version of the patch that became v8,
but that didn't happen because I ran out of time. I wanted to get out
a version with the low cardinality fix, to see if that helped with the
regression you talked about last week. (Prefetching seems to make only
a small difference when we're I/O bound, so it may not be that
important.)

Attached is v9 of the patch series. This actually has prefetching in
heapam.c. Prefetching is not just applied to favorable blocks, though
-- it's applied to all the blocks that we might visit, even though we
often won't really visit the last few blocks in line. This needs more
testing. The specific choices I made around prefetching were
definitely a bit arbitrary. To be honest, it was a bit of a
box-ticking thing (parity with similar code for its own sake). But
maybe I failed to consider particular scenarios in which prefetching
really is important.

My high level goal for v9 was to do cleanup of v8. There isn't very
much that you could call a new enhancement (just the prefetching
thing).

Other changes in v9 include:

* Much simpler approach to passing down an aminsert() hint from the
executor in v9-0002* patch.

Rather than exposing some HOT implementation details from
heap_update(), we use executor state that tracks updated columns. Now
all we have to do is tell ExecInsertIndexTuples() "this round of index
tuple inserts is for an UPDATE statement". It then figures out the
specific details (whether it passes the hint or not) on an index by
index basis. This interface feels much more natural to me.

This also made it easy to handle expression indexes sensibly. And, we
get support for the logical replication UPDATE caller to
ExecInsertIndexTuples(). It only has to say "this is for an UPDATE",
in the usual way, without any special effort (actually I need to test
logical replication, just to be sure, but I think that it works fine
in v9).

* New B-Tree sgml documentation in v9-0003* patch. I've added an
extensive user-facing description of the feature to the end of
"Chapter 64. B-Tree Indexes", near the existing discussion of
deduplication.

* New delete_items storage parameter. This makes it possible to
disable the optimization. Like deduplicate_items in Postgres 13, it is
not expected to be set to "off" very often.

I'm not yet 100% sure that a storage parameter is truly necessary -- I
might still change my mind and remove it later.
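
To make the per-index decision from the v9-0002* item above concrete,
here is a minimal sketch. It is not code from the patch: ColumnSet,
IndexSketch, and index_unchanged_hint() are hypothetical names, and
representing columns as a bitmask is an assumption made only for the
example. For an expression index, the sketch assumes that the columns
referenced by the expression are folded into the same mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per table column (bit 0 = first column). */
typedef uint64_t ColumnSet;

/* Hypothetical per-index summary; not a PostgreSQL struct. */
typedef struct IndexSketch
{
    const char *name;
    /* Key columns, plus any columns referenced by index expressions. */
    ColumnSet   referenced_cols;
} IndexSketch;

/*
 * Decide whether one index should receive the "logically unchanged"
 * hint for the current round of index tuple inserts.
 *
 * The caller only states whether the inserts are for an UPDATE; the
 * per-index answer falls out of comparing the updated columns with
 * the columns that the index depends on.
 */
bool
index_unchanged_hint(const IndexSketch *index, bool is_update,
                     ColumnSet updated_cols)
{
    /* Plain INSERTs never produce a new version of an existing row. */
    if (!is_update)
        return false;

    /* Hint only when the UPDATE touched none of the index's columns. */
    return (index->referenced_cols & updated_cols) == 0;
}
```

With this sketch, an UPDATE that modifies only the third column would
pass the hint for an index on the first column, and would not pass it
for an index on the third column.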

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v9-0001-Make-tableam-interface-support-bottom-up-deletion.patch application/octet-stream 7.8 KB
v9-0002-Pass-down-logically-unchanged-index-hint.patch application/octet-stream 28.6 KB
v9-0003-Teach-nbtree-to-use-bottom-up-index-deletion.patch application/octet-stream 68.6 KB
v9-0004-Teach-heapam-to-support-bottom-up-index-deletion.patch application/octet-stream 25.4 KB
