Re: Minimal logical decoding on standbys

From: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>
To: <fabriziomello(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, "[pgdg] Robert Haas" <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minimal logical decoding on standbys
Date: 2021-09-20 10:17:11
Message-ID: bff0a5b6-178c-1b64-b660-5ec59b1046bb@amazon.com
Lists: pgsql-hackers

Hi,

On 9/17/21 10:32 PM, Fabrízio de Royes Mello wrote:
>
> On Wed, Sep 15, 2021 at 8:36 AM Drouvot, Bertrand <bdrouvot(at)amazon(dot)com> wrote:
> >
> > Another rebase attached.
> >
> > The patch proposal to address Andres' walsender corner cases is
> > still a dedicated commit (as I think it may be easier to discuss).
> >
>
> Did one more battery of tests and everything went well...

Thanks for looking at it!

>
> But doing some manual tests:
>
> 1. Setup master/replica (wal_level=logical, hot_standby_feedback=on, etc)
> 2. Initialize the master instance: "pgbench -i -s10 on master"
> 3. Terminal1: execute "pgbench -c20 -T 2000"
> 4. Terminal2: create the logical replication slot:
>
> 271480 (replica) fabrizio=# select * from
> pg_create_logical_replication_slot('test_logical', 'test_decoding');
> -[ RECORD 1 ]-----------
> slot_name | test_logical
> lsn       | 1/C7C59E0
>
> Time: 37658.725 ms (00:37.659)
>
>
> Even with activity on the primary, creating the logical replication
> slot took ~38s. Can we do something about it, or do we need to
> clarify the documentation further?
>
For logical slot creation on the standby, as we cannot write WAL there,
we have to wait for an xl_running_xacts record to be logged on the
primary and replayed on the standby.

So we depend, to some extent, on checkpoints on the primary and on
LOG_SNAPSHOT_INTERVAL_MS.
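
As a back-of-the-envelope check on the ~38s reported above, here is a small
sketch of the expected wait. It is not taken from the patch and rests on
assumptions: that LOG_SNAPSHOT_INTERVAL_MS is 15 seconds, that only the
periodic snapshots matter (checkpoints and replay lag are ignored), and that
with transactions constantly running the snapshot builder needs about three
consecutive running-xacts records to become consistent. The helper name and
struct are hypothetical.

```c
#include <assert.h>

/* Assumed value of LOG_SNAPSHOT_INTERVAL_MS; verify against your source tree. */
#define ASSUMED_SNAPSHOT_INTERVAL_MS 15000L

/* Lower and upper bound, in milliseconds, for the slot creation wait. */
typedef struct WaitBounds
{
    long min_ms;
    long max_ms;
} WaitBounds;

/*
 * Hypothetical helper: estimate how long slot creation on a standby waits
 * when it needs `records_needed` running-xacts records and the primary logs
 * one every `interval_ms`.
 *
 * The first record arrives somewhere between 0 and one full interval after
 * slot creation starts; each further record arrives one interval later.
 */
static WaitBounds
estimate_slot_wait(int records_needed, long interval_ms)
{
    WaitBounds bounds;

    assert(records_needed >= 1);
    assert(interval_ms > 0);

    bounds.min_ms = (long) (records_needed - 1) * interval_ms;
    bounds.max_ms = (long) records_needed * interval_ms;
    return bounds;
}
```

Under these assumptions, estimate_slot_wait(3, ASSUMED_SNAPSHOT_INTERVAL_MS)
gives a range of 30000 to 45000 ms, which would be consistent with the
37658.725 ms measured during the pgbench run.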

If we want to get rid of this, what I could think of is having the
standby ask the primary to log a standby snapshot (until we get one we
are happy with).

Or, we may just want to mention in the doc:

+     For a logical slot to be created, it builds a historic snapshot, for which
+     information of all the currently running transactions is essential. On
+     primary, this information is available, but on standby, this information
+     has to be obtained from primary. So, creating a logical slot on standby
+     may take a noticeable time.

Instead of:

+     For a logical slot to be created, it builds a historic snapshot, for which
+     information of all the currently running transactions is essential. On
+     primary, this information is available, but on standby, this information
+     has to be obtained from primary. So, slot creation may wait for some
+     activity to happen on the primary. If the primary is idle, creating a
+     logical slot on standby may take a noticeable time.

What do you think?

Thanks

Bertrand
