Re: Deleting older versions in unique indexes to avoid page splits

From: Victor Yegorov <vyegorov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2020-11-25 12:43:10
Message-ID: CAGnEbohMXdMKVLaU_SwKrZM1q4RGO4_-U2Ati6-8FvwUTFkkcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ср, 25 нояб. 2020 г. в 05:35, Peter Geoghegan <pg(at)bowt(dot)ie>:

> Then I had a much better idea: Make the existing LP_DEAD stuff a
> little more like bottom-up index deletion. We usually have to access
> heap blocks that the index tuples point to today, in order to have a
> latestRemovedXid cutoff (to generate recovery conflicts). It's worth
> scanning the leaf page for index tuples with TIDs whose heap block
> matches the index tuples that actually have their LP_DEAD bits set.
> This only consumes a few more CPU cycles. We don't have to access any
> more heap blocks to try these extra TIDs, so it seems like a very good
> idea to try them out.
>

I don't seem to understand this.

Is it: we're scanning the leaf page for all LP_DEAD tuples that point to
the same
heap block? Which heap block we're talking about here, the one that holds
entry we're about to add (the one that triggered bottom-up-deletion due to
lack
of space I mean)?

I ran the regression tests with an enhanced version of the patch, with
> this LP_DEAD-deletion-with-extra-TIDs thing. It also had custom
> instrumentation that showed exactly what happens in each case. We
> manage to delete at least a small number of extra index tuples in
> almost all cases -- so we get some benefit in practically all cases.
> And in the majority of cases we can delete significantly more. It's
> not uncommon to increase the number of index tuples deleted. It could
> go from 1 - 10 or so without the enhancement to LP_DEAD deletion, to
> 50 - 250 with the LP_DEAD enhancement. Some individual LP_DEAD
> deletion calls can free more than 50% of the space on the leaf page.
>

I am missing a general perspective here.

Is it true, that despite the long (vacuum preventing) transaction we can
re-use space,
as after the DELETE statements commits, IndexScans are setting LP_DEAD
hints after
they check the state of the corresponding heap tuple?

If my thinking is correct for both cases — nature of LP_DEAD hint bits and
the mechanics of
suggested optimization — then I consider this a very promising improvement!

I haven't done any testing so far since sending my last e-mail.
If you'll have a chance to send a new v10 version with
LP_DEAD-deletion-with-extra-TIDs thing,
I will do some tests (planned).

--
Victor Yegorov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2020-11-25 12:50:20 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Konstantin Knizhnik 2020-11-25 12:16:05 Re: Implementing Incremental View Maintenance