Re: Add statistics to pg_stat_wal view for wal related parameter tuning

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add statistics to pg_stat_wal view for wal related parameter tuning
Date: 2021-01-20 03:48:27
Message-ID: a75afbc238f501244855aebba728e243@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-12-22 11:16, Masahiro Ikeda wrote:
> Thanks for your comments.
>
> On 2020-12-22 09:39, Andres Freund wrote:
>> Hi,
>>
>> On 2020-12-21 13:16:50 -0800, Andres Freund wrote:
>>> On 2020-12-02 13:52:43 +0900, Fujii Masao wrote:
>>> > Pushed. Thanks!
>>>
>>> Why are wal_records/fpi long, instead of uint64?
>>> long wal_records; /* # of WAL records produced */
>>> long wal_fpi; /* # of WAL full page images produced */
>>> uint64 wal_bytes; /* size of WAL records produced */
>>>
>>> long is only 4 byte e.g. on windows, and it is entirely possible to
>>> wrap
>>> a 4 byte record counter. It's also somewhat weird that wal_bytes is
>>> unsigned, but the others are signed?
>>>
>>> This is made doubly weird because on the SQL level you chose to make
>>> wal_records, wal_fpi bigint. And wal_bytes numeric?
>
> I'm sorry I don't know the reason.
>
> The following thread is related to the patch and the type of wal_bytes
> is changed from long to uint64 because XLogRecPtr is uint64.
> https://www.postgresql.org/message-id/flat/20200402144438.GF64485%40nol#1f0127c98df430104c63426fdc285c20
>
> I assumed that the reason why the type of wal_records/fpi is long
> is BufferUsage have the members (i.e, shared_blks_hit) of the same
> types.
>
> So, I think it's better if to change the type of wal_records/fpi from
> long to uint64,
> to change the types of BufferUsage's members too.

I've done a little more research so I'll share the results.

IUCC, theoretically this leads to caliculate the statistics less,
but actually, it's not happened.

The above "wal_records", "wal_fpi" are accumulation values and when
WalUsageAccumDiff()
is called, we can know how many wals are generated for specific
executions,
for example, planning/executing a query, processing a utility command,
and vacuuming one relation.

The following variable has accumulated "wal_records" and "wal_fpi" per
process.

```
typedef struct WalUsage
{
long wal_records; /* # of WAL records produced */
long wal_fpi; /* # of WAL full page images produced */
uint64 wal_bytes; /* size of WAL records produced */
} WalUsage;

WalUsage pgWalUsage;
```

Although this may be overflow, it doesn't affect to caliculate the
difference
of wal usage between some execution points. If to generate over 2
billion wal
records per executions, 4 bytes is not enough and collected statistics
will be
lost, but I think it's not happened.

In addition, "wal_records" and "wal_fpi" values sent by processes are
accumulated in the statistic collector and their types are
PgStat_Counter(int64).

```
typedef struct PgStat_WalStats
{
PgStat_Counter wal_records;
PgStat_Counter wal_fpi;
uint64 wal_bytes;
PgStat_Counter wal_buffers_full;
TimestampTz stat_reset_timestamp;
} PgStat_WalStats;
```

>> Some more things:
>> - There's both PgStat_MsgWal WalStats; and static PgStat_WalStats
>> walStats;
>> that seems *WAY* too confusing. And the former imo shouldn't be
>> global.
>
> Sorry for the confusing name.
> I referenced the following variable name.
>
> static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];
> static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
>
> How about to change from "PgStat_MsgWal WalStats"
> to "PgStat_MsgWal MsgWalStats"?
>
> Is it better to change the following name too?
> "PgStat_MsgBgWriter BgWriterStats;"
> "static PgStat_MsgSLRU SLRUStats[SLRU_NUM_ELEMENTS];"
>
> Since PgStat_MsgWal is called in xlog.c and pgstat.c,
> I thought it's should be global.

I made an attached patch to rename the above variable names.
What do you think?

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
0001-Refactor-variable-names-of-global-statistics-message.patch text/x-diff 9.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-01-20 03:54:35 Re: Deleting older versions in unique indexes to avoid page splits
Previous Message japin 2021-01-20 02:59:27 Re: Change default of checkpoint_completion_target