Re: Deleting older versions in unique indexes to avoid page splits

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2021-01-11 00:06:54
Message-ID: CAH2-WzkHnpMqH1W_r=1g++ReHfn1PdS04AiCK7CeYOBiDCVR7w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 7, 2021 at 3:07 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I agree. I'll use the name index_expression_changed_walker() in the
> next revision.

Attached is v13, which has this tweak, and other miscellaneous cleanup
based on review from both Victor and Heikki. I consider this version
of the patch to be committable. I intend to commit something close to
it in the next week, probably no later than Thursday. I still haven't
got to the bottom of the shellsort question raised by Heikki. I intend
to do further performance validation before committing the patch. I
will look into the shellsort thing again as part of this final
performance validation work -- perhaps I can get rid of the
specialized shellsort implementation entirely, simplifying the state
structs added to tableam.h. (As I said before, it seems best to
address this last of all to avoid making performance validation even
more complicated.)

This version of the patch is notable for removing the index storage
param, and for having lots of comment updates and documentation
consolidation, particularly in heapam.c. Many of the latter changes
are based on feedback from Heikki. Note that all of the discussion of
heapam level locality has been consolidated, and is now mostly
confined to a fairly large comment block over
bottomup_nblocksfavorable() in heapam.c. I also cut down on redundancy
among comments about the design at the whole-patch level, though a
small amount of redundancy in design docs/comments is a good thing IMV.
It was hard to get the balance exactly right, since bottom-up index
deletion is by its very nature a mechanism that requires the index AM
and the tableam to closely cooperate -- which is a novel thing.

This version isn't purely housekeeping, though. I did add one new minor
optimization to v13: We now count the heap block of the incoming new
item index tuple's TID (the item that won't fit on the leaf page
as-is) as an LP_DEAD-related block for the purposes of determining
which heap blocks will be visited during simple index tuple deletion.
The extra cost of doing this is low: when the new item heap block is
visited purely due to this new behavior, we're still practically
guaranteed not to take a buffer miss when reading the heap page. The
reason should be obvious: the executor is currently in the process of
modifying that same heap page anyway. The benefits are also high
relative to the cost. This heap block in particular seems to be very
promising as a place to look for deletable TIDs (I tested this with
custom instrumentation and microbenchmarks). I believe that this
effect exists because by its very nature garbage is often concentrated
in recently modified pages. This is per the generational hypothesis,
an important part of the theory behind GC algorithms for automated
memory management (GC theory seems to have real practical relevance to
the GC/VACUUM problems in Postgres, at least at a high level).

Of course we still won't do any simple deletion operations unless
there is at least one index tuple with its LP_DEAD bit set in the
first place at the point that it looks like the page will overflow (no
change there). As always, we're just piggy-backing some extra work on
top of an expensive operation that needed to take place anyway. I
couldn't resist adding this new minor optimization at this late stage,
because it is such a bargain.
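
To make the block-selection rule above concrete, here is a minimal
standalone sketch. It is not code from the patch: SketchIndexTuple,
sketch_blocks_to_visit() and sketch_count_extra_candidates() are
hypothetical names, and the structs are simplified stand-ins for the
real nbtree/tableam state. It only shows the idea that the new item's
heap block is added to the set of LP_DEAD-related blocks, and that
nothing happens at all unless at least one LP_DEAD bit is already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an index tuple on a leaf page. */
typedef struct SketchIndexTuple
{
    uint32_t    heap_block;     /* heap block number from the tuple's TID */
    bool        lp_dead;        /* is the LP_DEAD hint bit already set? */
} SketchIndexTuple;

/* Linear membership test; fine for the handful of blocks on one leaf page. */
static bool
sketch_block_in_set(const uint32_t *blocks, size_t nblocks, uint32_t blk)
{
    for (size_t i = 0; i < nblocks; i++)
    {
        if (blocks[i] == blk)
            return true;
    }
    return false;
}

/*
 * Build the set of heap blocks that a simple deletion pass would visit.
 *
 * Returns the number of distinct blocks stored in out_blocks.  Returns 0
 * when no tuple has LP_DEAD set, meaning no simple deletion is attempted.
 * The caller must provide room for ntuples + 1 entries (assumption of
 * this sketch, checked by the assertion below).
 */
static size_t
sketch_blocks_to_visit(const SketchIndexTuple *tuples, size_t ntuples,
                       uint32_t newitem_heap_block,
                       uint32_t *out_blocks, size_t out_capacity)
{
    size_t      nblocks = 0;

    assert(out_capacity >= ntuples + 1);

    /* First, collect the heap blocks of tuples already marked LP_DEAD. */
    for (size_t i = 0; i < ntuples; i++)
    {
        if (tuples[i].lp_dead &&
            !sketch_block_in_set(out_blocks, nblocks, tuples[i].heap_block))
            out_blocks[nblocks++] = tuples[i].heap_block;
    }

    /* No LP_DEAD bits at all: behave exactly as before, do nothing. */
    if (nblocks == 0)
        return 0;

    /* Also count the incoming new item's heap block as a block to visit. */
    if (!sketch_block_in_set(out_blocks, nblocks, newitem_heap_block))
        out_blocks[nblocks++] = newitem_heap_block;

    return nblocks;
}

/*
 * Count tuples that are not marked LP_DEAD but point into a block that
 * will be visited anyway; these can be checked at little extra cost.
 */
static size_t
sketch_count_extra_candidates(const SketchIndexTuple *tuples, size_t ntuples,
                              const uint32_t *blocks, size_t nblocks)
{
    size_t      nextra = 0;

    for (size_t i = 0; i < ntuples; i++)
    {
        if (!tuples[i].lp_dead &&
            sketch_block_in_set(blocks, nblocks, tuples[i].heap_block))
            nextra++;
    }
    return nextra;
}
```

With one LP_DEAD tuple pointing at heap block 10 and a new item
pointing at heap block 42, the visited set becomes {10, 42}, so any
other tuples on the leaf page that point into block 42 become
deletion candidates too.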

Thanks
--
Peter Geoghegan

Attachment Content-Type Size
v13-0001-Pass-down-logically-unchanged-index-hint.patch application/octet-stream 29.3 KB
v13-0002-Enhance-nbtree-index-tuple-deletion.patch application/octet-stream 143.4 KB
