Re: pg_basebackup may fail to send feedbacks.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: masao(dot)fujii(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_basebackup may fail to send feedbacks.
Date: 2015-02-20 08:29:14
Message-ID: 20150220.172914.241732690.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Thu, 19 Feb 2015 19:22:21 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in <CAHGQGwGLFLaFrCYcuikkVefNaoEL448TLSJ9oPsvb17v3foZHA(at)mail(dot)gmail(dot)com>
> On Wed, Feb 18, 2015 at 5:34 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Hello, this is the last patch for pg_basebackup/pg_receivexlog on
> > master (9.5). Prior versions don't have this issue.
> >
> > 4. basebackup_reply_fix_mst_v2.patch
> > receivelog.c patch applicable to master.
> >
> > This is based on the same design with
> > walrcv_reply_fix_91_v2.patch in the aspect of gettimeofday().
>
> Thanks for updating the patches! But I'm still not sure if the idea depending
> on the frequent calls of gettimeofday() for each WAL receive is good or not.

Neither am I. Nowadays, Linux on AMD64/x64 has no problem even
if gettimeofday() is called frequently, but Windows seems to have
a problem, and I don't know about other platforms.

One possible timing source is LSN.

> if ((blockpos - last_blockpos) / BLKSZ > 0)
> {
> now = feGetCurrentTimestamp();
> if (feTimestampDifferenceExceeds(last_status, now,
..
> if (!sendFeedback(conn, blockpos, now, false))
> }
> }
>
> last_blockpos = blockpos;

But once per page (BLKSZ) can easily be more frequent than once
per 10 records, while XLOG_SEG_SIZE seems too big. However, I
don't see any basis for choosing a frequency between the two, nor
any criterion other than time itself.
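
To make the trade-off concrete, here is a minimal standalone sketch of the LSN-gated check. It is not code from any of the patches: LsnGate, SKETCH_BLKSZ, lsn_gate_should_read_clock() and timestamp_difference_exceeds() are hypothetical names, and 8192 is only an assumed page size. Unlike the fragment above, it advances the stored position only when the gate opens, so several small receives can add up to one page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size, for illustration only (BLKSZ in the fragment above). */
#define SKETCH_BLKSZ 8192

/* Hypothetical state: WAL position at which the clock was last read. */
typedef struct LsnGate
{
    uint64_t last_checked_pos;
} LsnGate;

/*
 * Returns true once the received position has advanced by at least one
 * page since the last time this function returned true. The caller reads
 * the clock only in that case, so the LSN acts as a cheap timing source.
 */
static bool
lsn_gate_should_read_clock(LsnGate *gate, uint64_t blockpos)
{
    /* Received WAL positions never move backwards. */
    assert(blockpos >= gate->last_checked_pos);

    if ((blockpos - gate->last_checked_pos) / SKETCH_BLKSZ == 0)
        return false;

    gate->last_checked_pos = blockpos;
    return true;
}

/*
 * Hypothetical stand-in for a timestamp comparison: true when at least
 * `msec` milliseconds separate two microsecond timestamps.
 */
static bool
timestamp_difference_exceeds(int64_t start_us, int64_t stop_us, int msec)
{
    return (stop_us - start_us) >= (int64_t) msec * 1000;
}
```

In the receive loop, the clock would be read and compared against the last status time only when lsn_gate_should_read_clock() returns true. The open question stays the same: the gate width (one page here) is an arbitrary choice.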

To keep the main job as fast as possible, SIGALRM seems
preferable to me over introducing code with no reasonable
basis.
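
For reference, the general shape of a SIGALRM-driven approach looks roughly like the sketch below. This is a generic POSIX illustration, not the contents of the attached patch: feedback_due, start_feedback_timer() and consume_feedback_due() are hypothetical names, and it does not cover Windows, which has no SIGALRM.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaction() and setitimer() */
#endif

#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/time.h>

/* Set by the signal handler, cleared by the receive loop. */
static volatile sig_atomic_t feedback_due = 0;

/* The handler only sets a flag; all real work stays in the main loop. */
static void
on_sigalrm(int signo)
{
    (void) signo;
    feedback_due = 1;
}

/*
 * Install the handler and arm a repeating timer (interval_ms must be > 0).
 * Returns 0 on success, -1 on failure.
 */
static int
start_feedback_timer(long interval_ms)
{
    struct sigaction sa;
    struct itimerval timer;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigalrm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    timer.it_interval.tv_sec = interval_ms / 1000;
    timer.it_interval.tv_usec = (interval_ms % 1000) * 1000;
    timer.it_value = timer.it_interval;
    return setitimer(ITIMER_REAL, &timer, NULL);
}

/*
 * Called once per received WAL message: a plain flag test, no clock read.
 * Returns true at most once per timer tick.
 */
static bool
consume_feedback_due(void)
{
    if (!feedback_due)
        return false;
    feedback_due = 0;
    return true;
}
```

The receive loop would call consume_feedback_due() for each message and read the clock and send feedback only when it returns true, so the per-message cost is one flag test instead of a gettimeofday() call.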

> Some users may complain about the performance impact by such frequent calls
> and we may want to get rid of them from walreceiver loop in the future.
> If we adopt your idea now, I'm afraid that it would tie our hands in that case.
>
> How much impact can such frequent calls of gettimeofday() have on replication
> performance? If it's not negligible, probably we should remove them at first
> and find out another idea to fix the problem you pointed. ISTM that it's not so
> difficult to remove them. Thought? Do you have any numbers which can prove
> that such frequent gettimeofday() has only ignorable impact on the performance?

The attached patch is a 'more sober' version of the SIGALRM patch.

I'll look for another way after this.

regards,

Attachment Content-Type Size
basebackup_reply_fix_mst_v3_SIGALRM.patch text/x-patch 5.4 KB
