Re: logical replication empty transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Euler Taveira <euler(at)timbira(dot)com(dot)br>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical replication empty transactions
Date: 2020-03-03 10:04:01
Message-ID: CAA4eK1KnPYE9tsa-E+KVj7HFaZOTSexSqqkODqg7ARmXyrc-9A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 3, 2020 at 2:17 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Mar 3, 2020 at 1:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Mar 3, 2020 at 9:35 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Mar 2, 2020 at 4:56 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > One thing that is not clear to me is how will we advance restart_lsn
> > > > if we don't send any empty xact in a system where there are many such
> > > > xacts? IIRC, the restart_lsn is advanced based on confirmed_flush lsn
> > > > sent by subscriber. After this change, the subscriber won't be able
> > > > to send the confirmed_flush and for a long time, we won't be able to
> > > > advance restart_lsn. Is that correct, if so, why do we think that is
> > > > acceptable? One might argue that restart_lsn will be advanced as soon
> > > > as we send the first non-empty xact, but not sure if that is good
> > > > enough. What do you think?
> > >
> > > It seems like a valid point. One idea could be that we can track the
> > > last commit LSN which we streamed and if the confirmed flush location
> > > is already greater than that then, even if we skip sending the
> > > commit message, we can increase the confirmed flush location locally.
> > > Logically, it should not cause any problem because we have already got
> > > the confirmation for whatever we have streamed so far. So for other
> > > commits (which we are skipping), we can advance it locally because
> > > we are sure that we don't have any streamed commit which is not yet
> > > confirmed by the subscriber.
> > >
> >
> > Will this work after restart? Do you want to persist the information
> > of last streamed commit LSN?
>
> We will not persist the last streamed commit LSN, this variable is in
> memory just to track whether we have got confirmation up to that
> location or not, once we have confirmation up to that location and if
> we are not streaming any transaction (because those are empty
> transactions) then we can just advance the confirmed flush location
> and based on that we can update the restart point as well and those
> will be persisted. Basically, "last streamed commit LSN" is just a
> marker that there is still something pending to be confirmed from the
> subscriber, so until then we cannot simply advance the confirmed flush
> location or restart point based on the empty transactions. But, if
> there is nothing pending to be confirmed we can advance. So if we are
> streaming then we will get confirmation from subscriber otherwise we
> can advance it locally. So, in either case, the confirmed flush
> location and restart point will keep moving.
>

Okay, so this might work out, but it might look a bit ad-hoc.
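
For readers following along, the scheme quoted above can be modelled roughly as below. This is only a sketch, not PostgreSQL code: the class SlotState, its fields, and its method names are hypothetical, and LSNs are modelled as plain integers. It shows the rule that an empty transaction may advance the confirmed flush location locally only when every streamed commit has already been confirmed.

```python
from dataclasses import dataclass


@dataclass
class SlotState:
    """Hypothetical model of a slot's progress (not a PostgreSQL structure)."""

    confirmed_flush: int = 0       # would be persisted with the slot
    last_streamed_commit: int = 0  # in-memory marker only, lost on restart

    def on_commit(self, commit_lsn: int, is_empty: bool) -> bool:
        """Process a decoded commit; return True if it is streamed."""
        if not is_empty:
            # Streamed commits must be confirmed by the subscriber.
            self.last_streamed_commit = commit_lsn
            return True
        # Empty transaction: skip sending it. Advance locally only if
        # nothing that was streamed is still awaiting confirmation.
        if self.confirmed_flush >= self.last_streamed_commit:
            self.confirmed_flush = max(self.confirmed_flush, commit_lsn)
        return False

    def on_subscriber_feedback(self, flushed_lsn: int) -> None:
        """Record the flush position reported by the subscriber."""
        self.confirmed_flush = max(self.confirmed_flush, flushed_lsn)
```

In this model a restart simply resets last_streamed_commit to zero while confirmed_flush keeps its persisted value, which matches the claim that the marker itself need not be persisted.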

> >
> > > This is just my thought, but if we
> > > think from the code and design perspective then it might complicate
> > > things and sound hackish.
> > >
> >
> > Another idea could be that we stream the transaction after some
> > threshold number (say 100 or anything we think is reasonable) of empty
> > xacts. This will reduce the traffic without tinkering with the core
> > design too much.
>
> Yeah, this could be also an option.
>

Okay.
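
The threshold alternative quoted above could look roughly like the following. Again this is a sketch under assumptions: EmptyXactFilter and should_send are hypothetical names, the value 100 is just the example figure from the discussion, and resetting the counter whenever a non-empty transaction is sent is my assumption rather than something the proposal specifies.

```python
class EmptyXactFilter:
    """Hypothetical filter: send every Nth consecutive empty transaction."""

    def __init__(self, threshold: int = 100) -> None:
        self.threshold = threshold
        self.skipped = 0  # empty transactions seen since the last send

    def should_send(self, is_empty: bool) -> bool:
        """Return True if this transaction should be streamed."""
        if not is_empty:
            # Assumption: a real transaction triggers subscriber feedback
            # anyway, so the count of skipped empty ones starts over.
            self.skipped = 0
            return True
        self.skipped += 1
        if self.skipped >= self.threshold:
            # Send this empty transaction so the subscriber can report a
            # newer flush position.
            self.skipped = 0
            return True
        return False
```

With a threshold of 3, six consecutive empty transactions would result in the third and sixth being sent, so traffic drops while the subscriber still gets periodic chances to confirm progress.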

Peter E, Petr J, others, do you have any opinion on what is the best
way forward for this thread? I think it would be really good if we
can reduce the network traffic due to these empty transactions.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
