Re: logical replication empty transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)timbira(dot)com(dot)br>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical replication empty transactions
Date: 2020-03-02 11:26:40
Message-ID: CAA4eK1J-2XfzDWXHAd4g4=mN8igPyM-0WU5a6V1ZqiGB+JOdFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 2, 2020 at 9:01 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Sat, Nov 9, 2019 at 7:29 AM Euler Taveira <euler(at)timbira(dot)com(dot)br> wrote:
> >
> > Em seg., 21 de out. de 2019 às 21:20, Jeff Janes
> > <jeff(dot)janes(at)gmail(dot)com> escreveu:
> > >
> > > After setting up logical replication of a slowly changing table using the built in pub/sub facility, I noticed way more network traffic than made sense. Looking into I see that every transaction in that database on the master gets sent to the replica. 99.999+% of them are empty transactions ('B' message and 'C' message with nothing in between) because the transactions don't touch any tables in the publication, only non-replicated tables. Is doing it this way necessary for some reason? Couldn't we hold the transmission of 'B' until something else comes along, and then if that next thing is 'C' drop both of them?
> > >
> > That is not optimal. Those empty transactions is a waste of bandwidth.
> > We can suppress them if no changes will be sent. test_decoding
> > implements "skip empty transaction" as you described above and I did
> > something similar to it. Patch is attached.
>
> I think this significantly reduces the network bandwidth for empty
> transactions. I have briefly reviewed the patch and it looks good to
> me.
>

One thing that is not clear to me is how will we advance restart_lsn
if we don't send any empty xact in a system where there are many such
xacts? IIRC, the restart_lsn is advanced based on confirmed_flush lsn
sent by subscriber. After this change, the subscriber won't be able
to send the confirmed_flush and for a long time, we won't be able to
advance restart_lsn. Is that correct, if so, why do we think that is
acceptable? One might argue that restart_lsn will be advanced as soon
as we send the first non-empty xact, but not sure if that is good
enough. What do you think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-03-02 11:54:04 Re: Crash by targetted recovery
Previous Message Amit Kapila 2020-03-02 11:00:07 Re: [PATCH] Add schema and table names to partition error