From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com> |
Subject: | Re: Logical replication timeout problem |
Date: | 2022-03-29 05:07:17 |
Message-ID: | CAD21AoAo6x3rAQ7VuzPT9paA4Y7uuWPqdQn_XYk1bWpCMF_N5g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Mar 25, 2022 at 5:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Mar 25, 2022 at 11:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Mar 25, 2022 at 2:23 PM wangw(dot)fnst(at)fujitsu(dot)com
> > <wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Since commit 75b1521 added decoding of sequence to logical
> > replication, the patch needs to have pgoutput_sequence() call
> > update_progress().
> >
>
> Yeah, I also think this needs to be addressed. But apart from this, I
> want to know your and other's opinion on the following two points:
> a. Both this and the patch discussed in the nearby thread [1] add an
> additional parameter to
> WalSndUpdateProgress/OutputPluginUpdateProgress and it seems to me
> that both are required. The additional parameter 'last_write' added by
> this patch indicates: "If the last write is skipped then try (if we
> are close to wal_sender_timeout) to send a keepalive message to the
> receiver to avoid timeouts.". This means it can be used after any
> 'write' message. OTOH, the parameter 'skipped_xact' added by another
> patch [1] indicates if we have skipped sending anything for a
> transaction then sendkeepalive for synchronous replication to avoid
> any delays in such a transaction. Does this sound reasonable or can
> you think of a better way to deal with it?
These current approaches look good to me.
> b. Do we want to backpatch the patch in this thread? I am reluctant to
> backpatch because it changes the exposed API which can have an impact
> and second there exists a workaround (user can increase
> wal_sender_timeout/wal_receiver_timeout).
Yeah, we should avoid API changes between minor versions. I feel it's
better to fix it also for back-branches but probably we need another
fix for them. The issue reported on this thread seems quite
confusable; it looks like a network problem but is not true. Also, the
user who faced this issue has to increase wal_sender_timeout due to
the decoded data size, which also means to delay detecting network
problems. It seems an unrelated trade-off.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2022-03-29 05:13:00 | Re: Skipping logical replication transactions on subscriber side |
Previous Message | Michael Paquier | 2022-03-29 05:05:56 | Re: Add pg_freespacemap extension sql test |