Re: Why are wait events not reported even though it reads/writes a timeline history file?

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why are wait events not reported even though it reads/writes a timeline history file?
Date: 2020-05-01 03:04:56
Message-ID: 59474ea7-2480-428d-4850-4403254c1310@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/05/01 10:07, Masahiro Ikeda wrote:
> On 2020-05-01 00:25, Fujii Masao wrote:
>> On 2020/04/28 17:42, Masahiro Ikeda wrote:
>>> On 2020-04-28 15:09, Michael Paquier wrote:
>>>> On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:
>>>>> Isn't it safer to report the wait event during fgets() rather than putting
>>>>> those calls around the whole loop, like other code does? For example,
>>>>> writeTimeLineHistory() reports the wait event during read() rather than
>>>>> whole loop.
>>>>
>>>> Yeah, I/O wait events should be taken only during the duration of the
>>>> system calls.  Particularly here, you may finish with an elog() that
>>>> causes the wait event to be set longer than it should, leading to a
>>>> rather incorrect state if a snapshot of pg_stat_activity is taken.
>>>> --
>>>
>>> Thanks for your comments.
>>>
>>> I fixed it to report the wait event during fgets() only.
>>> Please review the v2 patch I attached.
>>
>> Thanks for updating the patch! Here are the review comments from me.
>>
>> +        char       *result;
>> +        pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
>> +        result = fgets(fline, sizeof(fline), fd);
>> +        pgstat_report_wait_end();
>> +        if (result == NULL)
>> +            break;
>> +
>>          /* skip leading whitespace and check for # comment */
>>          char       *ptr;
>>
>> Since the variable name "result" has been already used in this function,
>> it should be renamed.
>
> Sorry for that.
>
> I thought to rename it, but I changed to use feof()
> for clarify the difference from ferror().
>
>
>> The code should not be inject into the variable declaration block.
>
> Thanks for the comment.
> I moved the code block after the variable declaration block.
>
>
>> When reading this patch, I found that IO-error in fgets() has not
>> been checked there. Though this is not the issue that you reported,
>> but it seems better to fix it together. So what about adding
>> the following code?
>>
>>     if (ferror(fd))
>>         ereport(ERROR,
>>             (errcode_for_file_access(),
>>              errmsg("could not read file \"%s\": %m", path)));
>
> Thanks, I agree your comment.
> I added the above code to the v3 patch I attached.

Thanks for updating the patch! It looks good to me.

I applied cosmetic changes to the patch (attached). Barring any objection,
I will push this patch (also back-patch to v10 where wait-event for timeline
file was added).

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
v4-0001-Fix-to-report-wait-events-about-timeline-history-.patch text/plain 1.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-05-01 08:45:59 Re: pg_rewind docs correction
Previous Message Fujii Masao 2020-05-01 02:49:51 Re: SLRU statistics