Re: Add statistics to pg_stat_wal view for wal related parameter tuning

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add statistics to pg_stat_wal view for wal related parameter tuning
Date: 2020-10-22 01:09:21
Message-ID: 0720a2f666226ef3be4f4c2e8dea9045@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-10-21 13:41, Amit Kapila wrote:
> On Tue, Oct 20, 2020 at 12:41 PM Masahiro Ikeda
> <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>>
>> On 2020-10-20 12:46, Amit Kapila wrote:
>> > On Tue, Oct 20, 2020 at 8:01 AM Masahiro Ikeda
>> >> 1. Basic statistics of WAL activity
>> >>
>> >> - wal_records: Total number of WAL records generated
>> >> - wal_fpi: Total number of WAL full page images generated
>> >> - wal_bytes: Total amount of WAL bytes generated
>> >>
>> >> To understand DB's performance, first, we will check the performance
>> >> trends for the entire database instance.
>> >> For example, if the number of wal_fpi becomes higher, users may tune
>> >> "wal_compression", "checkpoint_timeout" and so on.
>> >>
>> >> Although users can check the above statistics via EXPLAIN,
>> >> auto_explain,
>> >> autovacuum and pg_stat_statements now,
>> >> if users want to see the performance trends for the entire database,
>> >> they must recalculate the statistics.
>> >>
>> >
>> > Here, do you mean to say 'entire cluster' instead of 'entire database'
>> > because it seems these stats are getting collected for the entire
>> > cluster?
>>
>> Thanks for your comments.
>> Yes, I wanted to say 'entire cluster'.
>>
>> >> I think it is useful to add the sum of the basic statistics.
>> >>
>> >
>> > There is an argument that it is better to view these stats at the
>> > statement-level so that one can know which statements are causing most
>> > WAL and then try to rate-limit them if required in the application and
>> > anyway they can get the aggregate of all the WAL if they want. We have
>> > added these stats in PG-13, so do we have any evidence that the
>> > already added stats don't provide enough information? I understand
>> > that you are trying to display the accumulated stats here which if
>> > required users/DBA need to compute with the currently provided stats.
>> > OTOH, sometimes adding more ways to do some things causes difficulty
>> > for users to understand and learn.
>>
>> I agreed that the statement-level stat is important and I understood
>> that we can
>> know the aggregated WAL stats of pg_stat_statement view and
>> autovacuum's
>> log.
>> But now, WAL stats generated by autovacuum can be output to logs and
>> it
>> is not
>> easy to aggregate them. Since WAL writes impacts for the entire
>> cluster,
>> I thought
>> it's natural to provide accumulated value.
>>
>
> I think it is other way i.e if we would have accumulated stats then it
> makes sense to provide those at statement-level because one would like
> to know the exact cause of more WAL activity. Say it is due to an
> autovacuum or due to the particular set of statements then it would
> easier for users to do something about it.

OK, I'll remove them.
Do you have any comments for other statistics?

--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-10-22 01:25:19 Re: [patch] Fix checksum verification in base backups for zero page headers
Previous Message k.jamison@fujitsu.com 2020-10-22 00:41:48 RE: [Patch] Optimize dropping of relation buffers using dlist