Re: Movement of restart_lsn position movement of logical replication slots is very slow

From:	Jammie <shailesh(dot)jamloki(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Movement of restart_lsn position movement of logical replication slots is very slow
Date:	2020-12-24 14:00:30
Message-ID:	CAFt1pcp=WwaqOqEPq4pie+_SDxdM2wZS6Aoi+kg1h8_OXhL8fQ@mail.gmail.com
Lists:	pgsql-hackers

Sorry, I don't have a debug setup handy. However, the SQL commands now do
work to move the restart_lsn of the slots in standalone code from psql.

A few follow-up questions.

What is catalog_xmin in pg_replication_slots, and what role does it play
in moving the restart_lsn of the slot?

I am just checking the possibility: can a special transaction cause a
private slot to go stale?

I do see that in the private slot catalog_xmin is also stuck along with
restart_lsn, even though confirmed_flush_lsn is updated correctly in
pg_replication_slots from the JDBC code.

Regards
Shailesh

On Thu, Dec 24, 2020 at 12:29 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Wed, Dec 23, 2020 at 7:06 PM Jammie <shailesh(dot)jamloki(at)gmail(dot)com> wrote:
> >
> > Thanks Amit for the response.
> > Two things :
> > 1) In our observation via PSQL the advance command as well do not move
> > the restart_lsn immediately. It is similar to our approach that use the
> > confirmed_flush_lsn via stream
> > 2) I am ok to understand the point that we are not reading from the
> > stream so we might be facing the issue. But the question is why we are able
> > to move the restart_lsn most of the time by updating the
> > confirmed_flush_lsn via pgJDBC. But only occasionally it lags behind too
> > far behind.
> >
>
> I am not sure why you are seeing such behavior. Is it possible for you
> to debug the code? Both confirmed_flush_lsn and restart_lsn are
> advanced in LogicalConfirmReceivedLocation. You can add elog to print
> the values to see the progress. Here, the point to note is that even
> though we update confirmed_flush_lsn every time with the new value but
> restart_lsn is updated only when candidate_restart_valid has a valid
> value each time after a call to LogicalConfirmReceivedLocation. We
> update candidate_restart_valid in
> LogicalIncreaseRestartDecodingForSlot which is called only during
> decoding of XLOG_RUNNING_XACTS record. So, it is not clear to me how
> in your case restart_lsn is getting advanced without decode? I think
> if you add some elogs in the code to track the values of
> candidate_restart_valid, confirmed_flush_lsn, and restart_lsn, you
> might get some clue.
>
> --
> With Regards,
> Amit Kapila.
>
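
The rule described in the quoted reply can be sketched as a toy model. This
is a minimal Python sketch, not PostgreSQL code: the class SlotModel, its
method names, and the integer LSNs are hypothetical, and the
candidate_restart_lsn field and the "<=" comparison are assumptions that the
thread does not spell out. It only illustrates why confirmed_flush_lsn can
advance on every confirmation while restart_lsn stays put until a candidate
has been recorded during decoding.

```python
from dataclasses import dataclass

INVALID_LSN = 0  # stand-in for "no valid LSN" in this toy model


@dataclass
class SlotModel:
    """Hypothetical, simplified model of the slot fields named in the thread."""
    restart_lsn: int = INVALID_LSN
    confirmed_flush_lsn: int = INVALID_LSN
    candidate_restart_valid: int = INVALID_LSN
    candidate_restart_lsn: int = INVALID_LSN  # assumed companion field

    def on_running_xacts_decoded(self, current_lsn: int, restart_lsn: int) -> None:
        """Models the step the thread calls LogicalIncreaseRestartDecodingForSlot.

        Simplification: a new candidate is recorded only if none is pending.
        """
        if self.candidate_restart_valid == INVALID_LSN:
            self.candidate_restart_valid = current_lsn
            self.candidate_restart_lsn = restart_lsn

    def confirm_received(self, lsn: int) -> None:
        """Models the step the thread calls LogicalConfirmReceivedLocation."""
        # confirmed_flush_lsn is updated on every call.
        self.confirmed_flush_lsn = lsn
        # restart_lsn moves only if a valid candidate exists and (assumption)
        # the confirmed position has reached it.
        if (self.candidate_restart_valid != INVALID_LSN
                and self.candidate_restart_valid <= lsn):
            self.restart_lsn = self.candidate_restart_lsn
            self.candidate_restart_valid = INVALID_LSN
            self.candidate_restart_lsn = INVALID_LSN


if __name__ == "__main__":
    slot = SlotModel(restart_lsn=100, confirmed_flush_lsn=100)
    slot.confirm_received(200)           # no decoding happened yet
    print(slot.confirmed_flush_lsn, slot.restart_lsn)
    slot.on_running_xacts_decoded(current_lsn=250, restart_lsn=180)
    slot.confirm_received(300)           # candidate is now reachable
    print(slot.confirmed_flush_lsn, slot.restart_lsn)
```

In this model, confirming a position without any decoding leaves restart_lsn
where it was; only after a candidate has been recorded does a later
confirmation move it.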
