Re: pg_basebackup may fail to send feedbacks.

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup may fail to send feedbacks.
Date: 2015-02-13 18:01:22
Message-ID: CAHGQGwHZ_4yWyokA+5ks9s_698adjH8+0haMCSC9WYFh37GY7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 10, 2015 at 7:48 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello,
>
> The attached patch is fix for walreciever not using gettimeofday,
> and fix for receivelog using it.
>
>> > XLogWalRcvProcessMsg doesn't send feedback when processing
>> > 'w'=WAL record packet. So the same thing as pg_basebackup and
>> > pg_receivexlog will occur on walsender.
>> >
>> > Exiting the for(;;) loop by TimestampDifferenceExceeds just
>> > before line 439 is an equivalent measure to I poposed for
>> > receivelog.c, but calling XLogWalRcvSendHSFeedback(false) there
>> > is seemingly simpler (but I feel a bit uncomfortable for the
>> > latter)
>>
>> I'm concerned about the performance degradation by calling
>> getimeofday() per a record.
>>
>> > Or other measures?
>>
>> Introduce the maximum number of records that we can receive and
>> process between feedbacks? For example, we can change
>> XLogWalRcvSendHSFeedback() so that it's called per at least
>> 8 records. I'm not sure what number is good, though...
>
> At the beginning of the "while(len > 0){if (len > 0){" block,
> last_recv_timestamp is always kept up to date (by using
> gettimeotda():). So we can use the variable instead of
> gettimeofday() in my previous proposal.

Yes, but something like last_recv_timestamp doesn't exist in
REL9_1_STABLE. So you don't think that your proposed fix
should be back-patched to all supported versions. Right?

> The start time of the timeout could be last_recv_timestamp at the
> beginning of the while loop, since it is a bit earlier than the
> actual time at the point.

Sounds strange to me. I think that last_recv_timestamp should be
compared with the time when the last feedback message was sent,
e.g., maybe you can expose sendTime in XLogWalRcvSendReplay()
as global variable, and use it to compare with last_recv_timestamp.

When we get out of the WAL receive loop due to the timeout check
which your patch added, XLogWalRcvFlush() is always executed.
I don't think that current WAL file needs to be flushed at that time.

> The another solution would be using
> RegisterTimeout/enable_timeout_after and it seemed to be work for
> me. It can throw out the time counting stuff in the loop we are
> talking about and that of XLogWalRcvSendHSFeedback and
> XLogWalRcvSendReply, but it might be a bit too large for the
> gain.

Yes, sounds overkill.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2015-02-13 18:42:07 why does find_my_exec resolve symlinks?
Previous Message Heikki Linnakangas 2015-02-13 17:44:35 Re: Refactoring GUC unit conversions