Re: About to add WAL write/fsync statistics to pg_stat_wal view

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Li Japin <japinli(at)hotmail(dot)com>, kuroda(dot)hayato(at)fujitsu(dot)com
Subject: Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date: 2021-01-25 23:37:36
Message-ID: 66f9cd47b2fc0e6b996a44a7aef4c6b0@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-01-26 00:03, Masahiko Sawada wrote:
> On Mon, Jan 25, 2021 at 4:51 PM Masahiro Ikeda
> <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>>
>> Hi, thanks for the reviews.
>>
>> I updated the attached patch.
>
> Thank you for updating the patch!
>
>> The summary of the changes is following.
>>
>> 1. fix document
>>
>> I followed another view's comments.
>>
>>
>> 2. refactor issue_xlog_fsync()
>>
>> I removed "sync_called" variables, narrowed the "duration" scope and
>> change the switch statement to if statement.
>
> Looking at the code again, I think if we check if an fsync was really
> called when calculating the I/O time, it's better to check that before
> starting the measurement.
>
> bool issue_fsync = false;
>
> if (enableFsync &&
> (sync_method == SYNC_METHOD_FSYNC ||
> sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH ||
> sync_method == SYNC_METHOD_FDATASYNC))
> {
> if (track_wal_io_timing)
> INSTR_TIME_SET_CURRENT(start);
> issue_fsync = true;
> }
> (snip)
> if (issue_fsync)
> {
> if (track_wal_io_timing)
> {
> instr_time duration;
>
> INSTR_TIME_SET_CURRENT(duration);
> INSTR_TIME_SUBTRACT(duration, start);
> WalStats.m_wal_sync_time =
> INSTR_TIME_GET_MICROSEC(duration);
> }
> WalStats.m_wal_sync++;
> }
>
> So I prefer either the above which is a modified version of the
> original approach or my idea that doesn’t introduce a new local
> variable I proposed before. But I'm not going to insist on that.

Thanks for the comments.
I change the code to the above.

>>
>>
>> 3. make wal-receiver report WAL statistics
>>
>> I add the code to collect the statistics for a written operation
>> in XLogWalRcvWrite() and to report stats in WalReceiverMain().
>>
>> Since WalReceiverMain() can loop fast, to avoid loading stats
>> collector,
>> I add "force" argument to the pgstat_send_wal function. If "force" is
>> false, it can skip reporting until at least 500 msec since it last
>> reported. WalReceiverMain() almost calls pgstat_send_wal() with
>> "force"
>> as false.
>
> void
> -pgstat_send_wal(void)
> +pgstat_send_wal(bool force)
> {
> /* We assume this initializes to zeroes */
> static const PgStat_MsgWal all_zeroes;
> + static TimestampTz last_report = 0;
>
> + TimestampTz now;
> WalUsage walusage;
>
> + /*
> + * Don't send a message unless it's been at least
> PGSTAT_STAT_INTERVAL
> + * msec since we last sent one or specified "force".
> + */
> + now = GetCurrentTimestamp();
> + if (!force &&
> + !TimestampDifferenceExceeds(last_report, now,
> PGSTAT_STAT_INTERVAL))
> + return;
> +
> + last_report = now;
>
> Hmm, I don’t think it's good to use PGSTAT_STAT_INTERVAL for this
> purpose since it is used as a minimum time for stats file updates. If
> we want an interval, I think we should define another one Also, with
> the patch, pgstat_send_wal() calls GetCurrentTimestamp() every time
> even if track_wal_io_timing is off, which is not good. On the other
> hand, I agree that your concern that the wal receiver should not send
> the stats for whenever receiving wal records. So an idea could be to
> send the wal stats when finishing the current WAL segment file and
> when timeout in the main loop. That way we can guarantee that the wal
> stats on a replica is updated at least every time finishing a WAL
> segment file when actively receiving WAL records and every
> NAPTIME_PER_CYCLE in other cases.

I agree with your comments. I think it should report when
reaching the end of WAL too. I add the code to report the stats
when finishing the current WAL segment file when timeout in the
main loop and when reaching the end of WAL.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
v6_v7.diff text/x-diff 11.8 KB
v7-0001-Add-statistics-related-to-write-sync-wal-records.patch text/x-diff 18.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2021-01-25 23:48:09 Re: About to add WAL write/fsync statistics to pg_stat_wal view
Previous Message Alvaro Herrera 2021-01-25 23:12:01 Re: Key management with tests