Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-09-06 08:48:52
Message-ID: CAJpy0uAYCSx80tzaQB8E3Y8R8Fmu29p4gb1fPJ5hbUsKB9_2AQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 1, 2023 at 1:59 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shveta. Here are some comments for patch v14-0002
>
> The patch is large, so my code review is a WIP... more later next week...
>

Thanks Peter for the feedback. I have tried to address most of these
in v15. Please find my response inline for the ones which I have not
addressed.

> ======
> GENERAL
>
> 1. Patch size
>
> The patch is 2700 lines. Is it possible to break this up into smaller
> self-contained parts to make the reviews more manageable?
>

Currently, patches are created based on work done on primary and
standby. Patch 001 for primary-side implementation and 002 for standby
side. Let me think more on this and see if the changes can be
segregated further.

>
> 26.
> +typedef struct SlotSyncWorker
> +{
> + /* Time at which this worker was launched. */
> + TimestampTz launch_time;
> +
> + /* Indicates if this slot is used or free. */
> + bool in_use;
> +
> + /* The slot in worker pool to which it is attached */
> + int slot;
> +
> + /* Increased every time the slot is taken by new worker. */
> + uint16 generation;
> +
> + /* Pointer to proc array. NULL if not running. */
> + PGPROC *proc;
> +
> + /* User to use for connection (will be same as owner of subscription). */
> + Oid userid;
> +
> + /* Database id to connect to. */
> + Oid dbid;
> +
> + /* Count of Database ids it manages */
> + uint32 dbcount;
> +
> + /* DSA for dbids */
> + dsa_area *dbids_dsa;
> +
> + /* dsa_pointer for database ids it manages */
> + dsa_pointer dbids_dp;
> +
> + /* Mutex to access dbids in dsa */
> + slock_t mutex;
> +
> + /* Info about slot being monitored for worker's naptime purpose */
> + SlotSyncWorkerWatchSlot monitor;
> +} SlotSyncWorker;
>
> There seems an awful lot about this struct which is common with
> 'LogicalRepWorker' struct.
>
> It seems a shame not to make use of the commonality instead of all the
> cut/paste here.
>
> E.g. Can it be rearranged so all these common fields are shared:
> - launch_time
> - in_use
> - slot
> - generation
> - proc
> - userid
> - dbid
>
> ======
> src/include/storage/lwlock.h

Sure, I had this in mind along with previous comments where it was
suggested to merge similar functions like
WaitForReplicationWorkerAttach, WaitForSlotSyncWorkerAttach etc. That
merging could only be possible if we try to merge the common part of
these structures. This is WIP, will be addressed in the next version.

>
> 27.
> LWTRANCHE_LAUNCHER_HASH,
> - LWTRANCHE_FIRST_USER_DEFINED
> + LWTRANCHE_FIRST_USER_DEFINED,
> + LWTRANCHE_SLOT_SYNC_DSA
> } BuiltinTrancheIds;
>
> Isn't 'LWTRANCHE_FIRST_USER_DEFINED' supposed to the be last enum?
>
> ======
> src/test/recovery/meson.build
>
> ======
> src/test/recovery/t/051_slot_sync.pl
>

I have currently removed this file from the patch. Please see my
comments (pt 3) here:
https://mail.google.com/mail/u/0/?ik=52d5778aba&view=om&permmsgid=msg-a:r-2984462571505788980

thanks
Shveta

> 28.
> +
> +# Copyright (c) 2021, PostgreSQL Global Development Group
> +
> +use strict;
>
> Wrong copyright date
>
> ~~~
>
> 29.
> +my $node_primary = PostgreSQL::Test::Cluster->new('primary');
> +my $node_phys_standby = PostgreSQL::Test::Cluster->new('phys_standby');
> +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
>
> 29a.
> Can't all the subroutines be up-front? Then this can move to be with
> the other node initialisation code that comets next.
>
> ~
>
> 29b.
> Add a comment something like # Setup nodes
>
> ~~~
>
> 30.
> +# Check conflicting status in pg_replication_slots.
> +sub check_slots_conflicting_status
> +{
> + my $res = $node_phys_standby->safe_psql(
> + 'postgres', qq(
> + select bool_and(conflicting) from pg_replication_slots;));
> +
> + is($res, 't',
> + "Logical slot is reported as conflicting");
> +}
>
> Doesn't bool_and() mean returns false if only some but not all slots
> are conflicting - is that intentional?> Or is this sub-routine only
> expecting to test one slot, in which case maybe the SQL should include
> also the 'slot_name'?
>
> ~~~
>
> 31.
> +$node_primary->start;
> +$node_primary->psql('postgres', q{SELECT
> pg_create_physical_replication_slot('pslot1');});
> +
> +$node_primary->backup('backup');
> +
> +$node_phys_standby->init_from_backup($node_primary, 'backup',
> has_streaming => 1);
> +$node_phys_standby->append_conf('postgresql.conf', q{
> +synchronize_slot_names = '*'
> +primary_slot_name = 'pslot1'
> +hot_standby_feedback = off
> +});
> +$node_phys_standby->start;
> +
> +$node_primary->safe_psql('postgres', "CREATE TABLE t1 (a int PRIMARY KEY)");
> +$node_primary->safe_psql('postgres', "INSERT INTO t1 VALUES (1), (2), (3)");
>
> The comments seem mostly to describe details about what are the
> expectations at each test step.
>
> IMO there also needs to be a larger "overview" comment to describe
> more generally *what* this is testing, and *how* it is testing it.
> e.g. it is hard to understand the test without being already familiar
> with the patch.
>
> ------
> Kind Regards,
> Peter Smith.
> Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2023-09-06 08:50:46 Re: GUC for temporarily disabling event triggers
Previous Message shveta malik 2023-09-06 08:37:20 Re: Synchronizing slots from primary to standby