Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-02 03:00:41
Message-ID: CAH2-WznzLSAQAje4PApqXP4Nr-HQKzz9CUTUHo16P8xQ9gpWpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 21, 2021 at 10:28 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Sorry for the late response.

Me too!

> 1. Whereas skipping index vacuum and heap vacuum is a very attractive
> improvement, if we skip that by default I wonder if we need a way to
> disable it. Vacuum plays a role in cleaning and diagnosing tables in
> practice. So in a case where the table is bad state and the user wants
> to clean all heap pages, it would be good to have a way to disable
> this skipping behavior. One solution would be that index_cleanup
> option has three different behaviors: on, auto (or smart), and off. We
> enable this skipping behavior by default in ‘auto’ mode, but
> specifying "INDEX_CLEANUP true” means to enforce index vacuum and
> therefore disabling it.

Sounds reasonable to me. Maybe users should express the skipping
behavior that they desire in terms of the *proportion* of all heap
blocks with LP_DEAD line pointers that we're willing to have while
still skipping index vacuuming + lazy_vacuum_heap() heap scan. In
other words, it can be a "scale" type GUC/param (though based on heap
blocks *at the end* of the first heap scan, not tuples at the point
the av launcher considers launching AV workers).

> @@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams
> *params, LVRelStats *vacrelstats,
> {
> lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
> all_visible = false;
> + has_dead_tuples = true;
> continue;
> }
>
> I added the above change in the patch to count the number of heap
> pages having at least one LP_DEAD line pointer. But it's weird to me
> that we have never set has_dead_tuples true when we found an LP_DEAD
> line pointer.

I think that you're right. However, in practice it isn't harmful
because has_dead_tuples is only used when "all_visible = true", and
only to detect corruption (which should never happen). I think that it
should be fixed as part of this work, though.

Lots of stuff in this area is kind of weird already. Sometimes this is
directly exposed to users, even. This came up recently, when I was
working on VACUUM VERBOSE stuff. (This touched the precise piece of
code you've patched in the quoted diff snippet, so perhaps you know
some of the story I will tell you now already.)

I recently noticed that VACUUM VERBOSE can report a very low
tups_vacuumed/"removable heap tuples" when run against tables where
most pruning is opportunistic pruning rather than VACUUM pruning
(which is very common), provided there are no HOT updates (which is
common but not very common). This can be very confusing, because
VACUUM VERBOSE will report a "tuples_deleted" for the heap relation
that is far far less than the "tuples_removed" it reports for indexes
on the same table -- even though both fields have values that are
technically accurate (e.g., not very affected by concurrent activity
during VACUUM, nothing like that).

This came to my attention when I was running BenchmarkSQL for the
64-bit XID deleted pages patch. One of the BenchmarkSQL tables (though
only one -- the table whose UPDATEs are not HOT safe, which is unique
among BenchmarkSQL/TPC-C tables). I pushed a commit with comment
changes [1] to make that aspect of VACUUM VERBOSE a little less
confusing. (I was actually running a quick-and-dirty hack that made
log_autovacuum show VACUUM VERBOSE index stuff -- I would probably
have missed the weird difference between heap tups_vacuumed and index
tuples_removed without this custom log_autovacuum hack.)

Just to be clear (I think you agree already): we should base any
triggering logic for skipping index vacuuming/lazy_vacuum_heap() on
logic that does not care *when* heap pages first contained LP_DEAD
line pointers (could be that they were counted in tups_vacuumed due to
being pruned during this VACUUM operation, could be from an earlier
opportunistic pruning, etc).

> Currently, we set it to false true in 'tupgone' case but
> it seems to me that we should do that in this case as well since we
> use this flag in the following check:
>
> else if (PageIsAllVisible(page) && has_dead_tuples)
> {
> elog(WARNING, "page containing dead tuples is marked as
> all-visible in relation \"%s\" page %u",
> vacrelstats->relname, blkno);
> PageClearAllVisible(page);
> MarkBufferDirty(buf);
> visibilitymap_clear(onerel, blkno, vmbuffer,
> VISIBILITYMAP_VALID_BITS);
> }

The "tupgone = true"/HEAPTUPLE_DEAD-race case is *extremely* weird. It
has zero test coverage according to coverage.postgresql.org [2],
despite being very complicated.

3 points on the "tupgone = true" weirdness (I'm writing this as a
record for myself, almost):

1. It is the reason why lazy_vacuum_heap() must be prepared to set
tuples LP_UNUSED that are not already LP_DEAD. So when
lazy_vacuum_page() says "the first dead tuple for this page", that
doesn't necessarily mean LP_DEAD items! (Though the other cases are
not even tested, I think -- the lack of "tupgone = true" test coverage
also means we don't cover corresponding lazy_vacuum_page() cases.)

2. This is also why we need XLOG_HEAP2_CLEANUP_INFO records (i.e. why
XLOG_HEAP2_CLEAN records are not sufficient to all required recovery
conflicts during VACUUM).

3. And it's also why log_heap_clean() is needed for both
lazy_scan_heap()'s pruning and lazy_vacuum_heap() unused-marking.

Many years ago, Noah Misch tried to clean this up -- that included
renaming lazy_vacuum_heap() to lazy_heap_clear_dead_items(), which
would only deal with LP_DEAD items:

https://www.postgresql.org/message-id/flat/20130108024957.GA4751%40tornado.leadboat.com

Of course, this effort to eliminate the "tupgone =
true"/XLOG_HEAP2_CLEANUP_INFO special case didn't go anywhere at the
time.

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7cde6b13a9b630e2f04d91e2f17dedc2afee21c6
[2] https://coverage.postgresql.org/src/backend/access/heap/vacuumlazy.c.gcov.html
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-03-02 03:10:45 Re: archive_command / pg_stat_archiver & documentation
Previous Message miyake_kouta 2021-03-02 02:57:37 [PATCH] psql : Improve code for help option