Re: Unifying VACUUM VERBOSE and log_autovacuum_min_duration output

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Unifying VACUUM VERBOSE and log_autovacuum_min_duration output
Date: 2021-12-11 17:52:29
Message-ID: CAH2-WzmxdGDOcZo5DdDbSm5GauPC7bjd9o0scRp=9PsTtzyabA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 10, 2021 at 8:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I think some actually ended up being omitted compared to the previous
> state. E.g. "aggressively vacuuming ...", but I think others as well.

The "aggressive-ness" is reported by a distinct ereport() with the
patch, so you'll still see that information. You'll still be able to
see when each VACUUM begins and ends, which matters in database level
"VACUUM" command (a VACUUM that doesn't specify any relation, and so
vacuums everything). VACUUM VERBOSE should still work as a progress
indicator at the whole-VACUUM-operation level (when there will be more
than a single operation per command), but it won't indicate the
progress of any individual VACUUM operation anymore. That's the
trade-off.

To me the most notable loss of VERBOSE information is the number of
index tuples deleted in each index. But even that isn't so useful,
since you can already see the number of LP_DEAD items, which is a more
interesting number (that applies equally to all indexes, and the table
itself).

> > It's easy to produce examples where the patch is somewhat more verbose
> > than HEAD (that's what you see here).
>
> We could make verbose a more complicated parameter if that turns out to be a
> problem. E.g. controlling whether resource usage is included.

That's true, but I don't think that it's going to be a problem. I'd
rather avoid it if possible. If we need to place some of the stuff
that's currently only shown by VERBOSE to be shown by the autovacuum
log output too, then that's fine.

You said something about showing the number of workers launched in the
autovacuum log output (also the new VERBOSE output). That could make
sense. But there could be a different number of workers for cleanup
and for vacuuming. Should I show both together, or just the high
watermark? I think that it needs to be okay to suppress the output in
the common case where parallelism isn't used (e.g., in every
autovacuum).

> > It's also easy to produce examples where HEAD is *significantly* more
> > verbose than the patch. Especially when VERBOSE shows many lines of
> > getrusage() output (patch only ever shows one of those, at the end).
>
> That's not really equivalent though? It does seem somewhat useful to be able
> to distinguish the cost of heap and index processing?

I've personally never used VACUUM VERBOSE like that. I agree that it
could be useful, but I would argue that it's not worth it. I'd just
use the DEBUG1 version, or more likely use my own custom
microbenchmark.

> This is quite the nest of conditions by now. Any chance of cleaning that up?

Yes, I can simplify that code a little.

> > @@ -3279,12 +3214,10 @@ lazy_truncate_heap(LVRelState *vacrel)

> These imo are useful. Perhaps we could just make them part of some log
> message that autovac logging includes as well?

I would argue that it already does -- because you see pages removed
(which is heap pages truncation). We do lose the details with the
patch, of course -- you'll no longer see the progress of truncation,
which works incrementally. But as I said, that's the general trade-off
that the patch makes.

If you can't truncate the table due to a conflicting lock request,
then that might just have been for the last round of truncation. And
so reporting that aspect in the whole-autovacuum log output (or in the
new format VACUUM VERBOSE output) seems like it could be misleading.

I went as far as removing the getrusage stuff for the ereport()
messages that get demoted to DEBUG2. What do you think of that aspect?
I could add some the getrusage output back where that makes sense. I
don't have very strong feelings about that.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-12-11 19:12:06 Re: Dubious usage of TYPCATEGORY_STRING
Previous Message Zhihong Yu 2021-12-11 17:09:11 Re: Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?