Re: Teaching users how they can get the most out of HOT in Postgres 14

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Teaching users how they can get the most out of HOT in Postgres 14
Date: 2021-05-14 02:56:05
Message-ID: CAH2-WzkNbJ7k=FRanW1Ubt2hxqFcnbdqfASJyDTYmN0KZ=Yf1g@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 13, 2021 at 7:14 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> Perhaps that's an awful deal, but based on which facts can you really
> say that this new behavior of needing at least 2% of relation pages
> with some dead items to clean up indexes is not a worse deal in some
> cases?

If I thought that it simply wasn't possible then I wouldn't have
accepted the need to make it possible to disable. This is a
cost/benefit decision problem, which must be made based on imperfect
information -- there are no absolute certainties. But I'm certain
about one thing: there is a large practical difference between the
optimization causing terrible performance in certain scenarios and the
optimization causing slightly suboptimal performance in certain
scenarios. A tiny risk of the former scenario is *much* worse than a
relatively large risk of the latter scenario. There needs to be a
sense of proportion about risk.

> This may cause more problems for the in-core index AMs, as
> much as it could impact any out-of-core index AM, no?

I don't understand what you mean here.

> What about
> other values like 1%, or even 5%? My guess is that there would be an
> ask to have more control on that, though that stands as my opinion.

How did you arrive at that guess? Why do you believe that? This is the
second time I've asked.

> Saying that, as long as there is a way to disable that for the users
> with autovacuum and manual vacuums, I'd be fine. It is worth noting
> that adding a GUC to control this optimization would make the code
> more confusing, as there is already do_index_cleanup, a
> vacuum_index_cleanup reloption, and specifying vacuum_index_cleanup to
> TRUE may cause the index cleanup to not actually kick in if the 2% bar is
> not reached.

I don't intend to add a GUC. A reloption should suffice.

Your interpretation of what specifying vacuum_index_cleanup (the
VACUUM command option) represents doesn't seem particularly justified
to me. To me it just means "index cleanup and vacuuming are not
explicitly disabled, the default behavior". It's an option largely
intended for emergencies, and largely superseded by the failsafe
mechanism. This interpretation is justified by well-established
precedent: it has long been possible for VACUUM to skip heap page
pruning and even heap page vacuuming just because a super-exclusive
lock could not be acquired (though the latter case no longer happens
due to the same work inside vacuumlazy.c) -- which also implies
skipping some index vacuuming, without it ever being apparent to the
user.
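
For illustration only, the 2% bar discussed earlier in this message can
be sketched roughly as below. This is a minimal sketch and not the
vacuumlazy.c code: the function name, the parameter names, and the
bypass_allowed flag (standing in for whatever off switch is eventually
exposed) are all hypothetical, and the real logic may take more than
this one ratio into account.

```python
# Illustrative sketch only; NOT the actual vacuumlazy.c implementation.
# All names here are hypothetical and exist only for this example.

BYPASS_THRESHOLD_PERCENT = 2  # the "2% bar" from the thread


def should_skip_index_vacuuming(rel_pages: int,
                                pages_with_dead_items: int,
                                bypass_allowed: bool = True) -> bool:
    """Return True when index vacuuming would be skipped under the 2% rule.

    rel_pages: total number of heap pages in the relation.
    pages_with_dead_items: heap pages holding at least one dead item.
    bypass_allowed: False models the user disabling the optimization
    (an assumed switch, e.g. a reloption).
    """
    if not bypass_allowed:
        return False
    # Integer arithmetic avoids floating-point edge cases at exactly 2%.
    return pages_with_dead_items * 100 < rel_pages * BYPASS_THRESHOLD_PERCENT


# 150 of 10,000 pages (1.5%) is under the bar, so indexes are skipped.
print(should_skip_index_vacuuming(10_000, 150))
# 500 of 10,000 pages (5%) is over the bar, so indexes are vacuumed.
print(should_skip_index_vacuuming(10_000, 500))
```

With the optimization disabled (bypass_allowed=False), the sketch always
falls through to index vacuuming, which is the behavior users opting out
would expect.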

--
Peter Geoghegan
