Re: New vacuum option to do only freezing

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New vacuum option to do only freezing
Date: 2019-03-06 18:55:14
Message-ID: CA+TgmoZJdLXJJ8HutxJ19h6pnDMcHXL6v1AaiODOOaWT644xHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 5, 2019 at 11:29 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Attached updated patch incorporated all of comments. Also I've added
> new reloption vacuum_index_cleanup as per discussion on the "reloption
> to prevent VACUUM from truncating empty pages at the end of relation"
> thread. Autovacuums also can skip index cleanup when the reloption is
> set to false. Since the setting this to false might lead some problems
> I've made autovacuums report the number of dead tuples and dead
> itemids we left.

It seems to me that the disable_index_cleanup should be renamed
index_cleanup and the default should be changed to true, for
consistency with the reloption (and, perhaps, other patches).

- num_tuples = live_tuples = tups_vacuumed = nkeep = nunused = 0;
+ num_tuples = live_tuples = tups_vacuumed = nkeep = nunused =
+ nleft_dead_itemids = nleft_dead_tuples = 0;

I would suggest leaving the existing line alone (and not adding an
extra space to it as the patch does now) and just adding a second
initialization on the next line as a separate statement. a = b = c = d
= e = 0 isn't such great coding style that we should stick to it
rigorously even when it ends up having to be broken across lines.

+ /* Index vacuum must be enabled in two-pass vacuum */
+ Assert(!skip_index_vacuum);

I am a big believer in naming consistency. Please, let's use the same
name everywhere! If it's going to be index_cleanup, then call the
reloption vacuum_index_cleanup, and call the option index_cleanup, and
call the variable index_cleanup. Picking a different subset of
cleanup, index, vacuum, skip, and disable for each new name makes it
harder to understand.

- * If there are no indexes then we can vacuum the page right now
- * instead of doing a second scan.
+ * If there are no indexes or index vacuum is disabled we can
+ * vacuum the page right now instead of doing a second scan.

This comment is wrong. That wouldn't be safe. And that's probably
why it's not what the code does.

- /* If no indexes, make log report that lazy_vacuum_heap would've made */
+ /*
+ * If no index or index vacuum is disabled, make log report that
+ * lazy_vacuum_heap would've make.
+ */
if (vacuumed_pages)

Hmm, does this really do what the comment claims? It looks to me like
we only increment vacuumed_pages when we call lazy_vacuum_page(), and
we (correctly) don't do that when index cleanup is disabled, but then
here this claims that if (vacuumed_pages) will be true in that case.

I wonder if it would be cleaner to rename vacrelstate->hasindex to
'useindex' and set it to false if there are no indexes or index
cleanup is disabled. But that might actually be worse, not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-03-06 19:04:27 Re: performance issue in remove_from_unowned_list()
Previous Message Alvaro Herrera 2019-03-06 18:52:50 Re: performance issue in remove_from_unowned_list()