Re: wal stats questions

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, masao(dot)fujii(at)oss(dot)nttdata(dot)com, andres(at)anarazel(dot)de
Subject: Re: wal stats questions
Date: 2021-03-30 11:37:36
Message-ID: 2edded96-5e8b-987c-c01a-16e1542dfec9@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2021/03/30 17:28, Kyotaro Horiguchi wrote:
> At Tue, 30 Mar 2021 09:41:24 +0900, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> wrote in
>> I update the patch since there were my misunderstanding points.
>>
>> On 2021/03/26 16:20, Masahiro Ikeda wrote:
>>> Thanks for many your suggestions!
>>> I made the patch to handle the issues.
>>>
>>>> 1) What is the motivation to have both prevWalUsage and pgWalUsage,
>>>> instead of just accumulating the stats since the last submission?
>>>> There doesn't seem to be any comment explaining it? Computing
>>>> differences to previous values, and copying to prev*, isn't free. I
>>>> assume this is out of a fear that the stats could get reset before
>>>> they're used for instrument.c purposes - but who knows?
>>>
>>> I removed the unnecessary code copying pgWalUsage and just reset the
>>> pgWalUsage after reporting the stats in pgstat_report_wal().
>>
>> I didn't change this.
>>
>>>> 2) Why is there both pgstat_send_wal() and pgstat_report_wal()? With the
>>>> former being used by wal writer, the latter by most other processes?
>>>> There again don't seem to be comments explaining this.
>>>
>>> I added the comments why two functions are separated.
>>> (But is it better to merge them?)
>>
>> I updated the comments for following reasons.
>>
>>
>>>> 3) Doing if (memcmp(&WalStats, &all_zeroes, sizeof(PgStat_MsgWal)) == 0)
>>>> just to figure out if there's been any changes isn't all that
>>>> cheap. This is regularly exercised in read-only workloads too. Seems
>>>> adding a boolean WalStats.have_pending = true or such would be
>>>> better.
>>>> 4) For plain backends pgstat_report_wal() is called by
>>>> pgstat_report_stat() - but it is not checked as part of the "Don't
>>>> expend a clock check if nothing to do" check at the top. It's
>>>> probably rare to have pending wal stats without also passing one of
>>>> the other conditions, but ...
>>>
>>> I added the logic to check if the stats counters are updated or not in
>>> pgstat_report_stat() using not only generated wal record but also write/sync
>>> counters, and it can skip to call reporting function.
>>
>> I removed the checking code whether the wal stats counters were updated or not
>> in pgstat_report_stat() since I couldn't understand the valid case yet.
>> pgstat_report_stat() is called by backends when the transaction is finished.
>> This means that the condition seems always pass.
>
> Doesn't the same holds for all other counters? If you are saying that
> "wal counters should be zero if all other stats counters are zero", we
> need an assertion to check that and a comment to explain that.
>
> I personally find it safer to add the WAL-stats condition to the
> fast-return check, rather than addin such assertion.
Thanks for your comments.

OK, I added the condition to the fast-return check. I noticed that I
misunderstood that the purpose is to avoid expanding a clock check using WAL
stats counters. But, the purpose is to make the conditions stricter, right?

> pgstat_send_wal() is called mainly from pgstat_report_wal() which
> consumes pgWalStats counters and WalWriterMain() which
> doesn't. Checking on pgWalStats counters isn't so complex that we need
> to avoid that in wal writer, thus *I* think pgstat_send_wal() and
> pgstat_report_wal() can be consolidated. Even if you have a strong
> opinion that wal writer should call a separate function, the name
> should be something other than pgstat_send_wal() since it ignores
> pgWalUsage counters, which are supposed to be included in "WAL stats".

OK, I consolidated them.

>> I thought to implement if the WAL stats counters were not updated, skip to
>> send all statistics including the counters for databases and so on. But I
>> think it's not good because it may take more time to be reflected the
>> generated stats by read-only transaction.
>
> Ur, yep.
>
>>> Although I added the condition which the write/sync counters are updated or
>>> not, I haven't understood the following case yet...Sorry. I checked related
>>> code and tested to insert large object, but I couldn't. I'll investigate more
>>> deeply, but if you already know the function which leads the following case,
>>> please let me know.
>>
>> I understood the above case (Fujii-san, thanks for your advice in person).
>> When to flush buffers, for example, to select buffer replacement victim,
>> it requires a WAL flush if the buffer is dirty. So, to check the WAL stats
>> counters are updated or not, I check the number of generated wal record and
>> the counter of syncing in pgstat_report_wal().
>>
>> The reason why not to check the counter of writing is that if to write is
>> happened, to sync is happened too in the above case. I added the comments in
>> the patch.
>
> Mmm.. Although I couldn't read you clearly, The fact that a flush may
> happen without a write means the reverse at the same time, a write may
> happen without a flush. For asynchronous commits, WAL-write happens
> alone unaccompanied by a flush. And the corresponding flush would
> happen later without writes.

Sorry, I didn't explain it enough.

For processes which may generate WAL records like backends, I thought it's
enough to check (1)the number of generated WAL records and (2)the counters of
syncing(flushing) the WAL. This is checked in pgstat_report_wal(). Sorry for
that I didn't mention (1) in the above thread.

If a backend execute a write transaction, some WAL records must be generated.
So, it's ok to check (1) only regardless of whether asynchronous commit is
enabled or not.

OHOT, if a backend execute a read-only transaction, WAL records won't be
generated (although HOT makes a wal records exactly...). But, WAL-write and
flush may happen when to flush buffers via XLogFlush(). In this case, if
WAL-write happened, flush must be happen later. But, if my understanding is
correct, there is no a case to flush doesn't happen, but to write happen.
So, I thought (2) is needed and it's enough to check the counter of
syncing(flushing).

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
v3-0001-performance-improvements-of-reporting-wal-stats.patch text/x-patch 7.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2021-03-30 11:40:35 Re: Asynchronous Append on postgres_fdw nodes.
Previous Message Markus Wanner 2021-03-30 11:31:36 Re: [PATCH] add concurrent_abort callback for output plugin