Re: Vacuum statistics

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Jim Nasby <jnasby(at)upgrade(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, a(dot)lepikhov(at)postgrespro(dot)ru
Subject: Re: Vacuum statistics
Date: 2024-11-30 04:48:22
Message-ID: CALdSSPhv1MAd9GPyWcnMxxP2ZgXFpCksqe7onbkVzQwLaySqgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 13 Nov 2024 at 21:21, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> Hi! Thank you for your contribution to this thread!
>
> On 13.11.2024 03:24, Jim Nasby wrote:
>
> On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
>
> On 08.11.2024 22:34, Jim Nasby wrote:
>
>
> On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> The second is the interrupts field. It is needed for monitoring to know
> do we have them or not, so tracking them on the database level will do
> the trick. Interrupt is quite rare event, so once the monitoring system
> will catch one the DBA can go to the server log for the details.
>
> Just to confirm… by “interrupt” you mean vacuum encountered an error?
>
> Yes it is.
>
> In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.
>
> I updated patches. I excluded system and user time statistics and save number of interrupts only for database. I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.
>
> I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
>
> I have fixed it.
>
>
> I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep stat_reset_timestamp at the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code. It appears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it in incase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set.
>
> Thank you! Yes, I have deleted them.
>
>
> I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a cleanup lock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure from how some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating since they mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and renamed ExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for missed_dead_tuples. Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a global (or at most per-database) level, but missed_dead_tuples should really be at a per-table level.
>
> I am willing to agree with your idea. But we need to think about how clearly describe them in the documentation.
>
>
> Updated 0001-v13 attached, as well as the diff between v12 and v13.
>
> Thank you)
>
> And I agree with your changes. And included them in patches.
>
> ---
> Regards,
> Alena Rybakina
> Postgres Professional

Hello!
After a brief glance, I think this patch set is good.
But there isn't any more time in the current CF to commit this :(.
So I moved to the next CF.

I also like the 0001 commit message. This commit message is quite
large and easy to understand. Actually, it might be too big. Perhaps
rather of being a commit message, the final paragraph (pages_frozen -
number of pages that..) need to be a part of the document. Perhaps
delete the explanation on pages_frozen that we have in 0004?

--
Best regards,
Kirill Reshke

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2024-11-30 06:23:08 Re: Enhance file_fdw to report processed and skipped tuples in COPY progress
Previous Message Michael Paquier 2024-11-30 03:58:44 Re: Memory leak in WAL sender with pgoutput (v10~)