Re: wal stats questions

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: Re: wal stats questions
Date: 2021-05-17 13:34:33
Message-ID: 458f4d15-7e73-155b-8bb9-af17bb775560@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/05/17 16:39, Masahiro Ikeda wrote:
>
> Thanks for your comments!
>
>>> +     * is executed, wal records aren't nomally generated (although HOT makes
>>
>> nomally -> normally?
>
> Yes, fixed.
>
>>> +     * It's not enough to check the number of generated wal records, for
>>> +     * example the walwriter may write/sync the WAL although it doesn't
>>
>> You use both 'wal' and 'WAL' in the comments, but are there any intension?
>
> No, I changed to use 'WAL' only. Although some other comments have 'wal' and
> 'WAL', it seems that 'WAL' is often used.

Thanks for updating the patch!

+ * Buffer and generated WAL usage counters.
+ *
+ * The counters are accumulated values. There are infrastructures
+ * to add the values and calculate the difference within a specific period.

Is it really worth adding these comments here?

+ * Note: regarding to WAL statistics counters, it's not enough to check
+ * only whether the WAL record is generated or not. If a read transaction
+ * is executed, WAL records aren't normally generated (although HOT makes
+ * WAL records). But, just writes and syncs the WAL data may happen when
+ * to flush buffers.

Aren't the following comments better?

------------------------------
To determine whether any WAL activity has occurred since last time, not only the number of generated WAL records but also the numbers of WAL writes and syncs need to be checked. Because even transaction that generates no WAL records can write or sync WAL data when flushing the data pages.
------------------------------

- * This function can be called even if nothing at all has happened. In
- * this case, avoid sending a completely empty message to the stats
- * collector.

I think that it's better to leave this comment as it is.

+ * First, to check the WAL stats counters were updated.
+ *
+ * Even if the 'force' is true, we don't need to send the stats if the
+ * counters were not updated.
+ *
+ * We can know whether the counters were updated or not to check only
+ * three counters. So, for performance, we don't allocate another memory
+ * spaces and check the all stats like pgstat_send_slru().

Is it really worth adding these comments here?

+ * The current 'wal_records' is the same as the previous one means that
+ * WAL records weren't generated. So, the counters of 'wal_fpi',
+ * 'wal_bytes', 'm_wal_buffers_full' are not updated neither.
+ *
+ * It's not enough to check the number of generated WAL records, for
+ * example the walwriter may write/sync the WAL although it doesn't
+ * generate WAL records. 'm_wal_write' and 'm_wal_sync' are zero means the
+ * counters of time spent are zero too.

Aren't the following comments better?

------------------------
Check wal_records counter to determine whether any WAL activity has happened since last time. Note that other WalUsage counters don't need to be checked because they are incremented always together with wal_records counter.

m_wal_buffers_full also doesn't need to be checked because it's incremented only when at least one WAL record is generated (i.e., wal_records counter is incremented). But for safely, we assert that m_wal_buffers_full is always zero when no WAL record is generated

This function can be called by a process like walwriter that normally generates no WAL records. To determine whether any WAL activity has happened at that process since the last time, the numbers of WAL writes and syncs are also checked.
------------------------

+ * The accumulated counters for buffer usage.
+ *
+ * The reason the counters are accumulated values is to avoid unexpected
+ * reset because many callers may use them.

Aren't the following comments better?

------------------------
These counters keep being incremented infinitely, i.e., must never be reset to zero, so that we can calculate how much the counters are incremented in an arbitrary period.
------------------------

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2021-05-17 13:45:04 Re: PG 14 release notes, first draft
Previous Message Amit Langote 2021-05-17 13:30:47 Re: Skip partition tuple routing with constant partition key