Re: About to add WAL write/fsync statistics to pg_stat_wal view

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Li Japin <japinli(at)hotmail(dot)com>
Subject: Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date: 2021-01-22 05:50:20
Message-ID: CAD21AoB=RCYME3+iroY+8TrC9tsHTPo7kRKCpU7C_kZ8kgGW2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 25, 2020 at 6:46 PM Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote:
>
> Hi,
>
> I rebased the patch to the master branch.

Thank you for working on this. I've read the latest patch. Here are comments:

---
+ if (track_wal_io_timing)
+ {
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_write_time +=
INSTR_TIME_GET_MILLISEC(duration);
+ }

* I think it should add the time in micro sec.

After running pgbench with track_wal_io_timing = on for 30 sec,
pg_stat_wal showed the following on my environment:

postgres(1:61569)=# select * from pg_stat_wal;
-[ RECORD 1 ]----+-----------------------------
wal_records | 285947
wal_fpi | 53285
wal_bytes | 442008213
wal_buffers_full | 0
wal_write | 25516
wal_write_time | 0
wal_sync | 25437
wal_sync_time | 14490
stats_reset | 2021-01-22 10:56:13.29464+09

Since writes can complete less than a millisecond, wal_write_time
didn't increase. I think sync_time could also have the same problem.

---
+ /*
+ * Measure i/o timing to fsync WAL data.
+ *
+ * The wal receiver skip to collect it to avoid performance
degradation of standy servers.
+ * If sync_method doesn't have its fsync method, to skip too.
+ */
+ if (!AmWalReceiverProcess() && track_wal_io_timing && fsyncMethodCalled())
+ INSTR_TIME_SET_CURRENT(start);

* Why does only the wal receiver skip it even if track_wal_io_timinig
is true? I think the performance degradation is also true for backend
processes. If there is another reason for that, I think it's better to
mention in both the doc and comment.

* How about checking track_wal_io_timing first?

* s/standy/standby/

---
+ /* increment the i/o timing and the number of times to fsync WAL data */
+ if (fsyncMethodCalled())
+ {
+ if (!AmWalReceiverProcess() && track_wal_io_timing)
+ {
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_sync_time += INSTR_TIME_GET_MILLISEC(duration);
+ }
+
+ WalStats.m_wal_sync++;
+ }

* I'd avoid always calling fsyncMethodCalled() in this path. How about
incrementing m_wal_sync after each sync operation?

---
+/*
+ * Check if fsync mothod is called.
+ */
+static bool
+fsyncMethodCalled()
+{
+ if (!enableFsync)
+ return false;
+
+ switch (sync_method)
+ {
+ case SYNC_METHOD_FSYNC:
+ case SYNC_METHOD_FSYNC_WRITETHROUGH:
+ case SYNC_METHOD_FDATASYNC:
+ return true;
+ default:
+ /* others don't have a specific fsync method */
+ return false;
+ }
+}

* I'm concerned that the function name could confuse the reader
because it's called even before the fsync method is called. As I
commented above, calling to fsyncMethodCalled() can be eliminated.
That way, this function is called at only once. So do we really need
this function?

* As far as I read the code, issue_xlog_fsync() seems to do fsync even
if enableFsync is false. Why does the function return false in that
case? I might be missing something.

* void is missing as argument?

* s/mothod/method/

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ian Lawrence Barwick 2021-01-22 05:50:49 Re: POC: postgres_fdw insert batching
Previous Message Amit Kapila 2021-01-22 05:48:53 Re: Parallel INSERT (INTO ... SELECT ...)