Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-02-01 11:35:44
Message-ID: CAGPVpCStDZj7mNchcBE8QKsNZdQCuHXAqX7_J-m3cgvaeoGFbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Please see attached patches for the below changes.

shveta malik <shveta(dot)malik(at)gmail(dot)com>, 27 Oca 2023 Cum, 13:12 tarihinde
şunu yazdı:

> On Thu, Jan 26, 2023 at 7:53 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
> wrote:
> 1.
> --Can we please add a few more points to the Summary to make it more clear.
> a) something telling that reusability of workers is for tables under
> one subscription and not across multiple subscriptions.
> b) Since we are reusing both workers and slots, can we add:
> --when do we actually end up spawning a new worker
> --when do we actually end up creating a new slot in a worker rather
> than using existing one.
> --if we reuse existing slots, what happens to the snapshot?
>

I modified the commit message if that's what you mean by the Summary.

> 2.
> + The last used ID for tablesync workers. This ID is used to
> + create replication slots. The last used ID needs to be stored
> + to make logical replication can safely proceed after any
> interruption.
> + If sublastusedid is 0, then no table has been synced yet.
>
> --typo:
> to make logical replication can safely proceed ==> to make logical
> replication safely proceed
>

Done

> 3.
> is_first_run;
> move_to_next_rel;
> --Do you think one variable is enough here as we do not get any extra
> info by using 2 variables? Can we have 1 which is more generic like
> 'ready_to_reuse'. Otherwise, please let me know if we must use 2.
>

Right. Removed is_first_run and renamed move_to_next_rel as ready_to_reuse.

> 4.
> /* missin_ok = true, since the origin might be already dropped. */
> typo: missing_ok
>

Done.

> 5. GetReplicationSlotNamesBySubId:
> errmsg("not tuple returned."));
>
> Can we have a better error msg:
> ereport(ERROR,
> errmsg("could not receive list of slots
> associated with subscription %d, error: %s", subid, res->err));
>

Done.

> 6.
> static void
> clean_sync_worker(void)
>
> --can we please add introductory comment for this function.
>

Done.

> 7.
> /*
> * Pick the table for the next run if there is not another worker
> * already picked that table.
> */
> Pick the table for the next run if it is not already picked up by
> another worker.
>

Done.

> 8.
> process_syncing_tables_for_sync()
>
> /* Cleanup before next run or ending the worker. */
> --can we please improve this comment:
> if there is no more work left for this worker, stop the worker
> gracefully, else do clean-up and make it ready for the next
> relation/run.
>

Done

> 9.
> LogicalRepSyncTableStart:
> * Read previous slot name from the catalog, if exists.
> */
> prev_slotname = (char *) palloc0(NAMEDATALEN);
> Do we need to free this at the end?
>

Pfree'd prev_slotname after we're done with it.

> 10.
> if (strlen(prev_slotname) == 0)
> {
> elog(ERROR, "Replication slot could not be
> found for relation %u",
> MyLogicalRepWorker->relid);
> }
> shall we mention subid also in error msg.
>

Done.

Thanks for reviewing,
--
Melih Mutlu
Microsoft

Attachment Content-Type Size
v5-0001-Add-replication-protocol-cmd-to-create-a-snapshot.patch application/octet-stream 20.9 KB
v9-0002-Reuse-Logical-Replication-Background-worker.patch application/octet-stream 72.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-02-01 11:40:45 Re: Can we do something to help stop users mistakenly using force_parallel_mode?
Previous Message David Rowley 2023-02-01 11:21:20 Re: heapgettup refactoring