Re: New statistics for tuning WAL buffer size

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, tsunakawa(dot)takay(at)fujitsu(dot)com, Magnus Hagander <magnus(at)hagander(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: Re: New statistics for tuning WAL buffer size
Date: 2020-09-15 06:52:30
Message-ID: 2613797456560d3c0585970c937bf1f3@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020-09-11 01:40, Fujii Masao wrote:
> On 2020/09/09 13:57, Masahiro Ikeda wrote:
>> On 2020-09-07 16:19, Fujii Masao wrote:
>>> On 2020/09/07 9:58, Masahiro Ikeda wrote:
>>>> Thanks for the review and advice!
>>>>
>>>> On 2020-09-03 16:05, Fujii Masao wrote:
>>>>> On 2020/09/02 18:56, Masahiro Ikeda wrote:
>>>>>>> +/* ----------
>>>>>>> + * Backend types
>>>>>>> + * ----------
>>>>>>>
>>>>>>> You seem to forget to add "*/" into the above comment.
>>>>>>> This issue could cause the following compiler warning.
>>>>>>> ../../src/include/pgstat.h:761:1: warning: '/*' within block
>>>>>>> comment [-Wcomment]
>>>>>>
>>>>>> Thanks for the comment. I fixed.
>>>>>
>>>>> Thanks for the fix! But why are those comments necessary?
>>>>
>>>> Sorry about that. This comment is not necessary.
>>>> I removed it.
>>>>
>>>>>> The pg_stat_walwriter is not security restricted now, so ordinary
>>>>>> users can access it.
>>>>>> It has the same security level as pg_stat_archiver. If you have
>>>>>> any comments, please let me know.
>>>>>
>>>>> +       <structfield>dirty_writes</structfield> <type>bigint</type>
>>>>>
>>>>> I guess that the column name "dirty_writes" derived from
>>>>> the DTrace probe name. Isn't this name confusing? We should
>>>>> rename it to "wal_buffers_full" or something?
>>>>
>>>> I agree and rename it to "wal_buffers_full".
>>>>
>>>>> +/* ----------
>>>>> + * PgStat_MsgWalWriter            Sent by the walwriter to update
>>>>> statistics.
>>>>>
>>>>> This comment seems not accurate because backends also send it.
>>>>>
>>>>> +/*
>>>>> + * WAL writes statistics counter is updated in XLogWrite function
>>>>> + */
>>>>> +extern PgStat_MsgWalWriter WalWriterStats;
>>>>>
>>>>> This comment seems not right because the counter is not updated in
>>>>> XLogWrite().
>>>>
>>>> Right. I fixed it to "Sent by each backend and background workers to
>>>> update WAL statistics."
>>>> In the future, other statistics will be included so I remove the
>>>> function's name.
>>>>
>>>>
>>>>> +-- There will surely and maximum one record
>>>>> +select count(*) = 1 as ok from pg_stat_walwriter;
>>>>>
>>>>> What about changing this comment to "There must be only one
>>>>> record"?
>>>>
>>>> Thanks, I fixed.
>>>>
>>>>> +                    WalWriterStats.m_xlog_dirty_writes++;
>>>>>                      LWLockRelease(WALWriteLock);
>>>>>
>>>>> Since WalWriterStats.m_xlog_dirty_writes doesn't need to be
>>>>> protected
>>>>> with WALWriteLock, isn't it better to increment that after
>>>>> releasing the lock?
>>>>
>>>> Thanks, I fixed.
>>>>
>>>>> +CREATE VIEW pg_stat_walwriter AS
>>>>> +    SELECT
>>>>> +        pg_stat_get_xlog_dirty_writes() AS dirty_writes,
>>>>> +        pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
>>>>> +
>>>>>  CREATE VIEW pg_stat_progress_vacuum AS
>>>>>
>>>>> In system_views.sql, the definition of pg_stat_walwriter should be
>>>>> placed just after that of pg_stat_bgwriter not
>>>>> pg_stat_progress_analyze.
>>>>
>>>> OK, I fixed it.
>>>>
>>>>>      }
>>>>> -
>>>>>      /*
>>>>>       * We found an existing collector stats file. Read it and put
>>>>> all the
>>>>>
>>>>> You seem to accidentally have removed the empty line here.
>>>>
>>>> Sorry about that. I fixed it.
>>>>
>>>>> -                 errhint("Target must be \"archiver\" or
>>>>> \"bgwriter\".")));
>>>>> +                 errhint("Target must be \"archiver\" or
>>>>> \"bgwriter\" or
>>>>> \"walwriter\".")));
>>>>>
>>>>> There are two "or" in the message, but the former should be
>>>>> replaced with ","?
>>>>
>>>> Thanks, I fixed.
>>>>
>>>>
>>>> On 2020-09-05 18:40, Magnus Hagander wrote:
>>>>> On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao
>>>>> <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>>>
>>>>>> On 2020/09/04 11:50, tsunakawa(dot)takay(at)fujitsu(dot)com wrote:
>>>>>>> From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
>>>>>>>>> I changed the view name from pg_stat_walwrites to
>>>>>> pg_stat_walwriter.
>>>>>>>>> I think it is better to match naming scheme with other views
>>>>>> like
>>>>>>>> pg_stat_bgwriter,
>>>>>>>>> which is for bgwriter statistics but it has the statistics
>>>>>> related to backend.
>>>>>>>>
>>>>>>>> I prefer the view name pg_stat_walwriter for the consistency
>>>>>>>> with
>>>>>>>> other view names. But we also have pg_stat_wal_receiver. Which
>>>>>>>> makes me think that maybe pg_stat_wal_writer is better for
>>>>>>>> the consistency. Thought? IMO either of them works for me.
>>>>>>>> I'd like to hear more opinons about this.
>>>>>>>
>>>>>>> I think pg_stat_bgwriter is now a misnomer, because it contains
>>>>>> the backends' activity.  Likewise, pg_stat_walwriter leads to
>>>>>> misunderstanding because its information is not limited to WAL
>>>>>> writer.
>>>>>>>
>>>>>>> How about simply pg_stat_wal?  In the future, we may want to
>>>>>> include WAL reads in this view, e.g. reading undo logs in zheap.
>>>>>>
>>>>>> Sounds reasonable.
>>>>>
>>>>> +1.
>>>>>
>>>>> pg_stat_bgwriter has had the "wrong name" for quite some time now
>>>>> --
>>>>> it became even more apparent when the checkpointer was split out to
>>>>> it's own process, and that's not exactly a recent change. And it
>>>>> had
>>>>> allocs in it from day one...
>>>>>
>>>>> I think naming it for what the data in it is ("wal") rather than
>>>>> which
>>>>> process deals with it ("walwriter") is correct, unless the
>>>>> statistics
>>>>> can be known to only *ever* affect one type of process. (And then
>>>>> different processes can affect different columns in the view). As a
>>>>> general rule -- and that's from what I can tell exactly what's
>>>>> being
>>>>> proposed.
>>>>
>>>> Thanks for your comments. I agree with your opinions.
>>>> I changed the view name to "pg_stat_wal".
>>>>
>>>>
>>>> I fixed the code to send the WAL statistics from not only backend
>>>> and walwriter
>>>> but also checkpointer, walsender and autovacuum worker.
>>>
>>> Good point! Thanks for updating the patch!
>>>
>>>
>>> @@ -604,6 +604,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams
>>> *params,
>>>                           onerel->rd_rel->relisshared,
>>>                           Max(new_live_tuples, 0),
>>>                           vacrelstats->new_dead_tuples);
>>> +    pgstat_send_wal();
>>>
>>> I guess that you changed heap_vacuum_rel() as above so that
>>> autovacuum
>>> workers can send WAL stats. But heap_vacuum_rel() can be called by
>>> the processes (e.g., backends) other than autovacuum workers? Also
>>> what happens if autovacuum workers just do ANALYZE only? In that
>>> case,
>>> heap_vacuum_rel() may not be called.
>>>
>>> Currently autovacuum worker reports the stats at the exit via
>>> pgstat_beshutdown_hook(). Unlike other processes, autovacuum worker
>>> is not the process that basically keeps running during the service.
>>> It exits
>>> after it does vacuum or analyze. So ISTM that it's not bad to report
>>> the stats
>>> only at the exit, in autovacuum worker case. There is no need to add
>>> extra
>>> code for WAL stats report by autovacuum worker. Thought?
>>
>> Thanks, I understood. I removed this code.
>>
>>>
>>> @@ -1430,6 +1430,9 @@ WalSndWaitForWal(XLogRecPtr loc)
>>>          else
>>>              RecentFlushPtr = GetXLogReplayRecPtr(NULL);
>>>  +        /* Send wal statistics */
>>> +        pgstat_send_wal();
>>>
>>> AFAIR logical walsender uses three loops in WalSndLoop(),
>>> WalSndWriteData()
>>> and WalSndWaitForWal(). But could you tell me why added
>>> pgstat_send_wal()
>>> into WalSndWaitForWal()? I'd like to know why WalSndWaitForWal() is
>>> the best
>>> for that purpose.
>>
>> I checked what function calls XLogBackgroundFlush() which calls
>> AdvanceXLInsertBuffer() to increment m_wal_buffers_full.
>>
>> I found that WalSndWaitForWal() calls it, so I added it.
>
> Ok. But XLogBackgroundFlush() calls AdvanceXLInsertBuffer() wit the
> second argument opportunistic=true, so in this case WAL write by
> wal_buffers full seems to never happen. Right? If this understanding
> is right, WalSndWaitForWal() doesn't need to call pgstat_send_wal().
> Probably also walwriter doesn't need to do that.
>
> The logical rep walsender can generate WAL and call
> AdvanceXLInsertBuffer() when it executes the replication commands like
> CREATE_REPLICATION_SLOT. But this case is already covered by
> pgstat_report_activity()->pgstat_send_wal() called in PostgresMain(),
> with your patch. So no more calls to pgstat_send_wal() seems necessary
> for logical rep walsender.

Thanks for your reviews. I didn't notice that.
I updated the patches.

On 2020-09-11 17:13, Fujii Masao wrote:
> On 2020/09/11 16:54, Kyotaro Horiguchi wrote:
>> At Fri, 11 Sep 2020 13:48:49 +0900, Fujii Masao
>> <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>>>
>>>
>>> On 2020/09/11 12:17, Kyotaro Horiguchi wrote:
>>>> Hello.
>>>> At Wed, 09 Sep 2020 13:57:37 +0900, Masahiro Ikeda
>>>> <ikedamsh(at)oss(dot)nttdata(dot)com> wrote in
>>>>> I checked what function calls XLogBackgroundFlush() which calls
>>>>> AdvanceXLInsertBuffer() to increment m_wal_buffers_full.
>>>>>
>>>>> I found that WalSndWaitForWal() calls it, so I added it.
>>>>> Is it better to move it in WalSndLoop() like the attached patch?
>>>> By the way, we are counting some wal-related numbers in
>>>> pgWalUsage.(bytes, records, fpi). Since now that we are going to
>>>> have
>>>> a new view related to WAL statistics, wouln't it be more useful to
>>>> show them together in the view?
>>>
>>> Probably yes. But IMO it's better to commit the current patch first,
>>> and then add those stats into the view after confirming exposing them
>>> is useful.
>>
>> I'm fine with that.
>>
>>> BTW, to expose the total WAL bytes, I think it's better to just save
>>> the LSN at when pg_stat_wal is reset rather than counting
>>> pgWalUsage.bytes. If we do that, we can easily total WAL bytes by
>>> subtracting that LSN from the latest LSN. Also saving the LSN at the
>>> reset timing causes obviously less overhead than counting
>>> pgWalUsage.bytes.
>>
>> pgWalUsage is always counting so it doesn't add any overhead.
>
> Yes. And I'm a bit concerned about the overhead by frequent message
> sent for WAL bytes to the stats collector.

Thanks for the comments.
I agree that we need to add more wal-related statistics
after this patch is committed.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachment Content-Type Size
0006_pg_stat_wal_view.patch text/x-diff 18.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2020-09-15 06:52:32 Re: logtape.c stats don't account for unused "prefetched" block numbers
Previous Message Jeff Davis 2020-09-15 06:44:47 Re: logtape.c stats don't account for unused "prefetched" block numbers