Re: [WiP] B-tree page merge during vacuum to reduce index bloat

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, boekewurm+postgres(at)gmail(dot)com, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kirk Wolak <wolakk(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>
Subject: Re: [WiP] B-tree page merge during vacuum to reduce index bloat
Date: 2025-11-10 19:16:52
Message-ID: CAC8Q8tJt6LFNaMjKAB0-SBm8q8p2ABQ47beEAgeFDFHKrUXQZg@mail.gmail.com
Lists: pgsql-hackers

Hello,

On Sun, Aug 31, 2025 at 4:16 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:

>
>
> > On 29 Aug 2025, at 13:39, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> >
>
> What if we just abort a scan, that stepped on the page where tuples were
> moved out?
>
...

> What do you think?
>

We have a database that sees bulk insertions and deletions covering
significant parts of a table.

B-tree and GiST bloat becomes such a significant issue there that we
have to resort to ad-hoc, cron-like solutions [1]. REINDEX CONCURRENTLY
also sometimes crashes under memory pressure, leaving invalid
("half-dead") indexes behind, which we have to clean up before
reindexing again until it succeeds [2].
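
To illustrate the cleanup step, here is a minimal sketch (not the script
linked in [2]) of how the leftovers of a failed REINDEX CONCURRENTLY
could be picked out. It assumes the caller has already fetched
(schema, index name, indisvalid) rows from pg_index joined with pg_class
and pg_namespace; the function names drop_statements and quote_ident are
made up for this example. It relies on the documented behaviour that
such leftovers are marked invalid and carry a "ccnew" or "ccold" suffix.

```python
import re

# REINDEX CONCURRENTLY names its transient indexes with a _ccnew or _ccold
# suffix; a numeric tail may follow when the plain name is already taken.
_LEFTOVER = re.compile(r"_cc(new|old)\d*$")


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_statements(rows):
    """Build DROP INDEX statements for invalid REINDEX CONCURRENTLY leftovers.

    rows: iterable of (schema, index_name, indisvalid) tuples (assumed to
    come from pg_index joined with pg_class and pg_namespace).
    """
    stmts = []
    for schema, name, valid in rows:
        # Only touch indexes that are both invalid and named like leftovers;
        # other invalid indexes are left for a human to look at.
        if not valid and _LEFTOVER.search(name):
            stmts.append(
                "DROP INDEX CONCURRENTLY IF EXISTS "
                f"{quote_ident(schema)}.{quote_ident(name)};"
            )
    return stmts
```

The generated statements would be run one at a time outside a
transaction block before the next REINDEX CONCURRENTLY attempt.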

Anything that improves the situation and makes Postgres handle this
automatically would improve the experience significantly.

Regarding locks: I think the baseline to compare against here is "what
would happen if I had to REINDEX instead", and that means an EXCLUSIVE
LOCK at some point. I'd set that as the baseline for this endeavour. I
think it may dramatically simplify correctness checks for the first
iterations and still relieve the pain in most cases.

A similar mechanism for GiST would also be helpful.

[1] https://github.com/konturio/insights-db/blob/main/scripts/reindex-bloated-btrees.sh
[2] https://github.com/konturio/insights-db/blob/main/scripts/drop_invalid_indexes.sql
