Re: New vacuum option to do only freezing

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New vacuum option to do only freezing
Date: 2019-05-02 15:09:10
Message-ID: CA+TgmoZTXfTe8AtLhYMXOGwEF4nMVoZNPsku1Vbg120L1vQRgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 16, 2019 at 4:00 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > I'm thinking that we really need to upgrade vacuum's reporting totals
> > so that it accounts in some more-honest way for pre-existing dead
> > line pointers. The patch as it stands has made the reporting even more
> > confusing, rather than less so.
>
> Here's a couple of ideas about that:
>
> 1. Ignore heap_page_prune's activity altogether, on the grounds that
> it's just random chance that any cleanup done there was done during
> VACUUM and not some preceding DML operation. Make tups_vacuumed
> be the count of dead line pointers removed. The advantage of this
> way is that tups_vacuumed would become independent of previous
> non-VACUUM pruning activity, as well as preceding index-cleanup-disabled
> VACUUMs. But maybe it's hiding too much information.
>
> 2. Have heap_page_prune count, and add to tups_vacuumed, only HOT
> tuples that it deleted entirely. The action of replacing a DEAD
> root tuple with a dead line pointer doesn't count for anything.
> Then also add the count of dead line pointers removed to tups_vacuumed.
>
> Approach #2 is closer to the way we've defined tups_vacuumed up to
> now, but I think it'd be more realistic in cases where previous
> pruning or index-cleanup-disabled vacuums have left lots of dead
> line pointers.
>
> I'm not especially wedded to either of these --- any other ideas?

I think it's almost impossible to have clear reporting here with only
a single counter. There are two clearly-distinct cleanup operations
going on here: (1) removing tuples from pages, and (2) making dead
line pointers unused so that they can be reused for new tuples. They
happen in equal quantity when there are no HOT updates: pruning makes
dead tuples into dead line pointers, and then index vacuuming allows
the dead line pointers to be set unused. But if there are HOT
updates, intermediate tuples in each HOT chain are removed from the
page but the line pointers are directly set to unused, so VACUUM could
remove a lot of tuples but not need to make very many dead line
pointers unused. On the other hand, the opposite could also happen:
maybe lots of single-page HOT-pruning has happened prior to VACUUM, so
VACUUM has lots of dead line pointers to make unused but removes very
few tuples because that's already been done.

For the most part, tups_vacuumed seems to be intending to count #1
rather than #2. While the comments for heap_page_prune and
heap_prune_chain are not as clear as might be desirable, it appears to
me that those functions are counting tuples removed from a page,
ignoring everything that might happen to line pointers -- so using the
return value of this function is consistent with wanting to count #1.
However, there's one place that seems slightly unclear about this,
namely this comment:

/*
* DEAD item pointers are to be vacuumed normally; but we don't
* count them in tups_vacuumed, else we'd be double-counting (at
* least in the common case where heap_page_prune() just freed up
* a non-HOT tuple).
*/

If we're counting tuples removed from pages, then it's not merely that
we would be double-counting, but that we would be counting completely
the wrong thing. However, as far as I can see, that's just an issue
with the comments; the code is in all cases counting tuple removals,
not dead line pointers marked unused.

If I understand correctly, your first proposal amounts to redefining
tups_vacuumed to count #2 rather than #1, and your second proposal
amounts to making tups_vacuumed count #1 + #2 rather than #1. I
suggest a third option: have two counters. tups_vacuum can continue
to count #1, with just a comment adjustment. And then we can add a
second counter which is incremented every time lazy_vacuum_page does
ItemIdSetUnused, which will count #2.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-05-02 15:31:07 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Previous Message Tom Lane 2019-05-02 15:08:43 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6