Re: Logical replication timeout problem

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Logical replication timeout problem
Date: 2022-04-01 02:27:46
Message-ID: CAA4eK1+wSvfDVUSe-4o6APFdO8=9G-HnZJ+01_LQOxqLmhqe5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 1, 2022 at 7:33 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Thu, Mar 31, 2022, at 9:24 AM, Masahiko Sawada wrote:
>
> On the other hand, possible another solution would be to add a new
> callback that is called e.g., every 1000 changes so that walsender
> does its job such as timeout handling while processing the decoded
> data in reorderbuffer.c. The callback is set only if the walsender
> does logical decoding, otherwise NULL. With this idea, other plugins
> will also be able to benefit without changes. But I’m not really sure
> it’s a good design, and adding a new callback introduces complexity.
>
> No new callback is required.
>
> In the current code, each output plugin callback is responsible to call
> OutputPluginUpdateProgress. It is up to the output plugin author to add calls
> to this function. The lack of a call in a callback might cause issues like what
> was described in the initial message.
>

This is exactly our initial analysis and we have tried a patch on
these lines and it has a noticeable overhead. See [1]. Calling this
for each change or each skipped change can bring noticeable overhead
that is why we decided to call it after a certain threshold (100) of
skipped changes. Now, surely as mentioned in my previous reply we can
make it generic such that instead of calling this (update_progress
function as in the patch) for skipped cases, we call it always. Will
that make it better?

> The functions CreateInitDecodingContext and CreateDecodingContext receives the
> update_progress function as a parameter. These functions are called in 2
> places: (a) streaming replication protocol (CREATE_REPLICATION_SLOT) and (b)
> SQL logical decoding functions (pg_logical_*_changes). Case (a) uses
> WalSndUpdateProgress as a progress function. Case (b) does not have one because
> it is not required -- local decoding/communication. There is no custom update
> progress routine for each output plugin which leads me to the question:
> couldn't we encapsulate the update progress call into the callback functions?
>

Sorry, I don't get your point. What exactly do you mean by this?
AFAIS, currently we call this output plugin API in pgoutput functions
only, do you intend to get it invoked from a different place?

[1] - https://www.postgresql.org/message-id/OS3PR01MB6275DFFDAC7A59FA148931529E209%40OS3PR01MB6275.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2022-04-01 02:31:29 Re: unlogged sequences
Previous Message Peter Geoghegan 2022-04-01 02:05:12 Re: should vacuum's first heap pass be read-only?