Re: vacuum verbose detail logs are unclear; log at *start* of each stage; show allvisible/frozen/hintbits

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum verbose detail logs are unclear; log at *start* of each stage; show allvisible/frozen/hintbits
Date: 2020-01-13 00:45:43
Message-ID: 20200113004542.GA26045@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Dec 29, 2019 at 01:15:24PM -0500, Jeff Janes wrote:
> On Fri, Dec 20, 2019 at 12:11 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> > This is a usability complaint. If one knows enough about vacuum and/or
> > logging, I'm sure there's no issue.
>
> > | 11 DEBUG: "t": found 999 removable, 999 nonremovable row versions in 9 out of 9 pages
>
> I agree the mixture of pre-action and after-action reporting is rather
> confusing sometimes. I'm more concerned about what the user sees in their
> terminal, though, rather than the server's log file.

Sorry, I ran vacuum (not verbose) with client_min_messages=debug, which was confusing.

> Also, the above quoted line is confusing. It makes it sound like it found
> removable items, but didn't actually remove them. I think that that is
> taking grammatical parallelism too far. How about something like:
>
> DEBUG: "t": removed 999 row versions, found 999 nonremovable row versions in 9 out of 9 pages

Since da4ed8bf, lazy_vacuum_heap() actually says: "removed %d [row versions] in
%d pages". Strangely, the "found .. removable, .. nonremovable" in
lazy_scan_heap() is also from da4ed8bf. Should we change them to match ?

> Also, I'd appreciate a report on how many hint-bits were set
> and how many pages were marked all-visible and/or frozen.

Possibly should fork this part to a different thread, but..
hint bits are being set by heap_prune_chain():

|#0 HeapTupleSatisfiesVacuum (htup=htup(at)entry=0x7fffabfcccc0, OldestXmin=OldestXmin(at)entry=536, buffer=buffer(at)entry=167) at heapam_visibility.c:1245
|#1 0x00007fb6eb3eb848 in heap_prune_chain (prstate=0x7fffabfccf30, OldestXmin=536, rootoffnum=1, buffer=167, relation=0x7fb6eb1e6858) at pruneheap.c:488
|#2 heap_page_prune (relation=relation(at)entry=0x7fb6eb1e6858, buffer=buffer(at)entry=167, OldestXmin=536, report_stats=report_stats(at)entry=false, latestRemovedXid=latestRemovedXid(at)entry=0x7fb6ed84a13c) at pruneheap.c:223
|#3 0x00007fb6eb3f02a2 in lazy_scan_heap (aggressive=false, nindexes=0, Irel=0x0, vacrelstats=0x7fb6ed84a0c0, params=0x7fffabfcdfd0, onerel=0x7fb6eb1e6858) at vacuumlazy.c:970
|#4 heap_vacuum_rel (onerel=0x7fb6eb1e6858, params=0x7fffabfcdfd0, bstrategy=<optimized out>) at vacuumlazy.c:302

In the attached, I moved heap_page_prune to avoid a second loop over items.
Then, initdb crashed until I avoided calling heap_prepare_freeze_tuple() for
HEAPTUPLE_DEAD. I'm not sure that's ok or maybe if it's exposing an issue.
I'm also not sure if t_infomask!=oldt_infomask is the right test.

One of my usability complaints was that the DETAIL includes newlines, which
makes it not apparent that it's detail, or that it's associated with the
preceding INFO. Should those all be separate DETAIL messages (currently, only
the first errdetail is used, but maybe they should be catted together
usefully). Should errdetail do something with newlines, like change them to
\n\t for output to the client (but not logfile). Should vacuum itself do
something (but probably no change to logfiles).

I remembered that log_statement_stats looks like this:

2020-01-01 11:28:33.758 CST [3916] LOG: EXECUTOR STATISTICS
2020-01-01 11:28:33.758 CST [3916] DETAIL: ! system usage stats:
! 0.050185 s user, 0.000217 s system, 0.050555 s elapsed
! [2.292346 s user, 0.215656 s system total]
[...]

It calls errdetail_internal("%s", str.data), same as vaccum, but the multi-line
detail messages are written like this:
|appendStringInfo(&str, "!\t...")
|...
|ereport(LOG,
| (errmsg_internal("%s", title),
| errdetail_internal("%s", str.data)));

Since they can run multiple times, including rusage, and there's not currently
any message shown before their action, I propose that lazy_vacuum_index/heap
should write VACUUM VERBOSE logs at DEBUG level. Or otherwise show a log
before starting each action, at least those for which it logs completion.

I'm not sure why this one doesn't use get ngettext() ? Missed at a8d585c0 ?
|appendStringInfo(&buf, _("There were %.0f unused item identifiers.\n"),

Or why this one uses _/gettext() ? (580ddcec suggests that I'm missing
something?).
|appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));

Anyway, now it looks like this:
postgres=# VACUUM VERBOSE t;
INFO: vacuuming "pg_temp_3.t"
INFO: "t": removed 1998 row versions in 5 pages
INFO: "t": removed 1998, found 999 nonremovable row versions in 9 out of 9 pages
DETAIL: ! 0 dead row versions cannot be removed yet, oldest xmin: 4505
! There were 0 unused item identifiers.
! Skipped 0 pages due to buffer pins, 0 frozen pages.
! 0 pages are entirely empty.
! Marked 9 pages all visible, 4 pages frozen.
! Wrote 1998 hint bits.
! CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Thanks for your input.

Justin

Attachment Content-Type Size
v1-0001-Rename-buf-to-avoid-shadowing-buf-of-another-type.patch text/x-diff 2.7 KB
v1-0002-vacuumlazy-typos-in-comments.patch text/x-diff 1.4 KB
v1-0003-Indent-VACUUM-VERBOSE-multi-line-errdetail-messag.patch text/x-diff 3.7 KB
v1-0004-reduce-to-DEBUG-status-logged-from-vacuum_heap-in.patch text/x-diff 3.9 KB
v1-0005-Report-number-of-pages_allvisible-frozen.patch text/x-diff 4.4 KB
v1-0006-Report-number-of-hint-bits-written.patch text/x-diff 4.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2020-01-13 01:47:53 Re: bitmaps and correlation
Previous Message Tomas Vondra 2020-01-13 00:24:15 Re: Using multiple extended statistics for estimates