Re: Logical replication keepalive flood

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: abbas(dot)butt(at)enterprisedb(dot)com
Cc: amit(dot)kapila16(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, zahid(dot)iqbal(at)enterprisedb(dot)com
Subject: Re: Logical replication keepalive flood
Date: 2021-06-09 08:17:51
Message-ID: 20210609.171751.1579873424296912837.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 9 Jun 2021 11:21:55 +0900, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> The issue - if actually it is - we send a keep-alive packet before a
> quite short sleep.
>
> We really want to send it if the sleep gets long but we cannot predict
> that before entering a sleep.
>
> Let me think a little more on this..

After some investigation, I found that the keepalives are almost
always sent after XLogSendLogical requests the *next* record. In
most cases the record is not yet inserted at request time but is
inserted very soon afterwards (within single-digit milliseconds). It
doesn't seem to be expected that this happens at such a high frequency
while XLogSendLogical is keeping up with the bleeding edge of WAL
records.

It is completely unpredictable when the next record will arrive, so
at that point we cannot decide whether or not to send a keepalive.

What we want is to send a keepalive when we have had nothing to send
for a while. Sending keepalives at some interval while the loop is
busy is a somewhat different thing.

As a possible solution, the attached patch splits the sleep into two
pieces. If the first sleep reaches its timeout, we send a keepalive
and then sleep for the remaining time. The first timeout is quite
arbitrary, but a keepalive rate of 4Hz at maximum doesn't look so bad
to me.
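
To illustrate the idea only (this is not the attached patch), here is
a standalone sketch of the two-stage sleep. The names WaitEnv,
wait_for_data, two_stage_wait and FIRST_SLEEP_MS are hypothetical, the
wait is simulated rather than a real latch wait, and the 250 ms value
is just what a 4Hz upper bound would imply.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed first-stage timeout: 250 ms caps idle keepalives at about 4 per second. */
#define FIRST_SLEEP_MS 250

/* Simulated environment standing in for the walsender's latch wait (hypothetical). */
typedef struct WaitEnv
{
    long data_arrives_after_ms; /* when the next record shows up; negative = never */
    long elapsed_ms;            /* simulated time spent waiting so far */
    int  keepalives_sent;       /* number of keepalives "sent" */
} WaitEnv;

/* Simulated wait: returns true if data arrives within timeout_ms. */
static bool
wait_for_data(WaitEnv *env, long timeout_ms)
{
    long until_data = env->data_arrives_after_ms - env->elapsed_ms;

    if (env->data_arrives_after_ms >= 0 && until_data <= timeout_ms)
    {
        if (until_data > 0)
            env->elapsed_ms += until_data;
        return true;
    }
    env->elapsed_ms += timeout_ms;
    return false;
}

/*
 * Two-stage sleep: wait briefly first, and send a keepalive only if that
 * short wait times out, then wait for the rest of the requested time.
 * Returns true if data arrived before the total timeout.
 */
static bool
two_stage_wait(WaitEnv *env, long total_ms)
{
    long first = total_ms < FIRST_SLEEP_MS ? total_ms : FIRST_SLEEP_MS;

    assert(total_ms >= 0);

    /* Stage 1: the next record usually arrives within a few milliseconds. */
    if (wait_for_data(env, first))
        return true;

    /* The sleep turned out to be long, so a keepalive is worthwhile now. */
    env->keepalives_sent++;

    /* Stage 2: sleep for whatever remains of the original timeout. */
    return wait_for_data(env, total_ms - first);
}
```

With this shape, a record that arrives 5 ms after the request causes no
keepalive at all, while a genuinely idle connection still gets one
after the first timeout.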

Is it acceptable?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
reduce_idle_time_keepalive_on_logrep_PoC1.patch text/x-patch 3.6 KB
