Re: About to add WAL write/fsync statistics to pg_stat_wal view

From: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Li Japin <japinli(at)hotmail(dot)com>, kuroda(dot)hayato(at)fujitsu(dot)com
Subject: Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date: 2021-03-15 01:54:01
Message-ID: c616cf24bf4ecd818f7cab0900a40842@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>> On 2021/03/11 21:29, Masahiro Ikeda wrote:
>>> On 2021-03-11 11:52, Fujii Masao wrote:
>>>> On 2021/03/11 9:38, Masahiro Ikeda wrote:
>>>>> On 2021-03-10 17:08, Fujii Masao wrote:
>>>>>> IIUC the stats collector basically exits after checkpointer and
>>>>>> walwriter exit.
>>>>>> But there seems no guarantee that the stats collector processes
>>>>>> all the messages that other processes have sent during the
>>>>>> shutdown of
>>>>>> the server.
>>>>>
>>>>> Thanks, I understood the above postmaster behaviors.
>>>>>
>>>>> PMState manages the status and after checkpointer is exited, the
>>>>> postmaster sends
>>>>> SIGQUIT signal to the stats collector if the shutdown mode is smart
>>>>> or fast.
>>>>> (IIUC, although the postmaster kill the walsender, the archiver and
>>>>> the stats collector at the same time, it's ok because the walsender
>>>>> and the archiver doesn't send stats to the stats collector now.)
>>>>>
>>>>> But, there might be a corner case to lose stats sent by background
>>>>> workers like
>>>>> the checkpointer before they exit (although this is not implemented
>>>>> yet.)
>>>>>
>>>>> For example,
>>>>>
>>>>> 1. checkpointer send the stats before it exit
>>>>> 2. stats collector receive the signal and break before processing
>>>>>     the stats message from checkpointer. In this case, 1's message
>>>>> is lost.
>>>>> 3. stats collector writes the stats in the statsfiles and exit
>>>>>
>>>>> Why don't you recheck the coming message is zero just before the
>>>>> 2th procedure?
>>>>> (v17-0004-guarantee-to-collect-last-stats-messages.patch)
>>>>
>>>> Yes, I was thinking the same. This is the straight-forward fix for
>>>> this issue.
>>>> The stats collector should process all the outstanding messages when
>>>> normal shutdown is requested, as the patch does. On the other hand,
>>>> if immediate shutdown is requested or emergency bailout (by
>>>> postmaster death)
>>>> is requested, maybe the stats collector should skip those
>>>> processings
>>>> and exit immediately.
>>>>
>>>> But if we implement that, we would need to teach the stats collector
>>>> the shutdown type (i.e., normal shutdown or immediate one). Because
>>>> currently SIGQUIT is sent to the collector whichever shutdown is
>>>> requested,
>>>> and so the collector cannot distinguish the shutdown type. I'm
>>>> afraid that
>>>> change is a bit overkill for now.
>>>>
>>>> BTW, I found that the collector calls pgstat_write_statsfiles() even
>>>> at
>>>> emergency bailout case, before exiting. It's not necessary to save
>>>> the stats to the file in that case because subsequent server startup
>>>> does
>>>> crash recovery and clears that stats file. So it's better to make
>>>> the collector exit immediately without calling
>>>> pgstat_write_statsfiles()
>>>> at emergency bailout case? Probably this should be discussed in
>>>> other
>>>> thread because it's different topic from the feature we're
>>>> discussing here,
>>>> though.
>>>
>>> IIUC, only the stats collector has another hander for SIGQUIT
>>> although
>>> other background processes have a common hander for it and just call
>>> _exit(2).
>>> I thought to guarantee when TerminateChildren(SIGTERM) is invoked,
>>> don't make stats
>>> collector shutdown before other background processes are shutdown.
>>>
>>> I will make another thread to discuss that the stats collector should
>>> know the shutdown type or not.
>>> If it should be, it's better to make the stats collector exit as soon
>>> as possible if the shutdown type
>>> is an immediate, and avoid losing the remaining stats if it's normal.
>>
>> +1

I researched the past discussion related to writing the stats files when
the immediate
shutdown is requested. And I found the following thread([1]) although
the discussion is
stopped on 12/1/2016.

IIUC, the thread's consensus are

(1) To kill the stats collector soon before writing the stats file is
needed in some case
because there is a possibility that it takes a long time until the
failover happens.
The possible reasons are that disk write speed is slow, stats files
are big, and so on.

(2) It needs to change the behavior from removing all stats files when
the startup does
crash recovery because autovacuum uses the stats.

(3) It's ok that the stats collector exit without calling
pgstat_write_statsfiles() if
the stats file is written every X minutes (using wal or another
mechanism) and startup
process can restore the stats with slightly low freshness.

(4) It needs to find the way how to handle the (2)'s stats file when
deleting on PITR
rewind or stats collector crash happens.

So, I need to ping the threads. But I don't have any idea to handle (4)
yet...

[1]
https://www.postgresql.org/message-id/flat/0A3221C70F24FB45833433255569204D1F5EF25A%40G01JPEXMBYT05

>>
>>>
>>>
>>>
>>>>> I measured the timing of the above in my linux laptop using
>>>>> v17-measure-timing.patch.
>>>>> I don't have any strong opinion to handle this case since this
>>>>> result shows to receive and processes
>>>>> the messages takes too short time (less than 1ms) although the
>>>>> stats collector receives the shutdown
>>>>> signal in 5msec(099->104) after the checkpointer process exits.
>>>>
>>>> Agreed.
>>>>
>>>>>
>>>>> ```
>>>>> 1615421204.556 [checkpointer] DEBUG:  received shutdown request
>>>>> signal
>>>>> 1615421208.099 [checkpointer] DEBUG:  proc_exit(-1): 0 callbacks to
>>>>> make              # exit and send the messages
>>>>> 1615421208.099 [stats collector] DEBUG:  process BGWRITER stats
>>>>> message              # receive and process the messages
>>>>> 1615421208.099 [stats collector] DEBUG:  process WAL stats message
>>>>> 1615421208.104 [postmaster] DEBUG:  reaping dead processes
>>>>> 1615421208.104 [stats collector] DEBUG:  received shutdown request
>>>>> signal             # receive shutdown request from the postmaster
>>>>> ```
>>>>>
>>>>>>>> Of course, there is another direction; we can improve the stats
>>>>>>>> collector so
>>>>>>>> that it guarantees to collect all the sent stats messages. But
>>>>>>>> I'm afraid
>>>>>>>> this change might be big.
>>>>>>>
>>>>>>> For example, implement to manage background process status in
>>>>>>> shared memory and
>>>>>>> the stats collector collects the stats until another background
>>>>>>> process exits?
>>>>>>>
>>>>>>> In my understanding, the statistics are not required high
>>>>>>> accuracy,
>>>>>>> it's ok to ignore them if the impact is not big.
>>>>>>>
>>>>>>> If we guarantee high accuracy, another background process like
>>>>>>> autovacuum launcher
>>>>>>> must send the WAL stats because it accesses the system catalog
>>>>>>> and might generate
>>>>>>> WAL records due to HOT update even though the possibility is low.
>>>>>>>
>>>>>>> I thought the impact is small because the time uncollected stats
>>>>>>> are generated is
>>>>>>> short compared to the time from startup. So, it's ok to ignore
>>>>>>> the remaining stats
>>>>>>> when the process exists.
>>>>>>
>>>>>> I agree that it's not worth changing lots of code to collect such
>>>>>> stats.
>>>>>> But if we can implement that very simply, isn't it more worth
>>>>>> doing
>>>>>> that than current situation because we may be able to collect more
>>>>>> accurate stats.
>>>>>
>>>>> Yes, I agree.
>>>>> I attached the patch to send the stats before the wal writer and
>>>>> the checkpointer exit.
>>>>> (v17-0001-send-stats-for-walwriter-when-shutdown.patch,
>>>>> v17-0002-send-stats-for-checkpointer-when-shutdown.patch)
>>>>
>>>> Thanks for making those patches! Firstly I'm reading 0001 and 0002
>>>> patches.
>>>
>>> Thanks for your comments and for making patches.
>>>
>>>
>>>> Here is the review comments for 0001 patch.
>>>>
>>>> +/* Prototypes for private functions */
>>>> +static void HandleWalWriterInterrupts(void);
>>>>
>>>> HandleWalWriterInterrupts() and HandleMainLoopInterrupts() are
>>>> almost the same.
>>>> So I don't think that we need to introduce
>>>> HandleWalWriterInterrupts(). Instead,
>>>> we can just call pgstat_send_wal(true) before
>>>> HandleMainLoopInterrupts()
>>>> if ShutdownRequestPending is true in the main loop. Attached is the
>>>> patch
>>>> I implemented that way. Thought?
>>>
>>> I thought there is a corner case that can't send the stats like
>>
>> You're right! So IMO your patch
>> (v17-0001-send-stats-for-walwriter-when-shutdown.patch) is better.
>>>
>>> ```
>>> // First, ShutdownRequstPending = false
>>>
>>>      if (ShutdownRequestPending)    // don't send the stats
>>>          pgstat_send_wal(true);
>>>
>>> // receive signal and ShutdownRequestPending became true
>>>
>>>      HandleMainLoopInterrupts();   // proc exit without sending the
>>> stats
>>>
>>> ```
>>>
>>> Is it ok because it almost never occurs?
>>>
>>>
>>>> Here is the review comments for 0002 patch.
>>>>
>>>> +static void pgstat_send_checkpointer(void);
>>>>
>>>> I'm inclined to avoid adding the function with the prefix "pgstat_"
>>>> outside
>>>> pgstat.c. Instead, I'm ok to just call both pgstat_send_bgwriter()
>>>> and
>>>> pgstat_report_wal() directly after ShutdownXLOG(). Thought? Patch
>>>> attached.
>>>
>>> Thanks. I agree.
>>
>> Thanks for the review!
>>
>>
>> So, barring any objection, I will commit the changes for
>> walwriter and checkpointer. That is,
>>
>> v17-0001-send-stats-for-walwriter-when-shutdown.patch
>> v17-0002-send-stats-for-checkpointer-when-shutdown_fujii.patch
>
> I pushed these two patches.

Thanks a lot!

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Avinash Kumar 2021-03-15 01:54:13 Re: Postgres crashes at memcopy() after upgrade to PG 13.
Previous Message Masahiro Ikeda 2021-03-15 01:39:06 Re: About to add WAL write/fsync statistics to pg_stat_wal view