Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-02-02 11:31:09
Message-ID: CAJpy0uD-Kn0-rva5iFGXO6jWbhyBd6shyieSUuS7PODc6ySC6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 1, 2023 at 5:37 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
>
> Hi,
> Please see attached patches.
>
> Thanks,
> --
> Melih Mutlu
> Microsoft

Hi Melih,

Few suggestions on v10-0002-Reuse patch

1)
for (int64 i = 1; i <= lastusedid; i++)
{
char originname_to_drop[NAMEDATALEN] = {0};
snprintf(originname_to_drop,
sizeof(originname_to_drop), "pg_%u_%lld", subid, (long long) i);
.......
}

--Is it better to use the function
'ReplicationOriginNameForLogicalRep' here instead of sprintf, just to
be consistent everywhere to construct origin-name?

2)
pa_launch_parallel_worker:
launched = logicalrep_worker_launch(MyLogicalRepWorker->dbid,
MySubscription->oid,

MySubscription->name,

MyLogicalRepWorker->userid,
InvalidOid,

dsm_segment_handle(winfo->dsm_seg),
0);

--Can we please define 'InvalidRepSlotId' macro and pass it here as
the last arg to make it more readable.
#define InvalidRepSlotId 0
Same in ApplyLauncherMain where we are passing 0 as last arg to
logicalrep_worker_launch.

3)
We are safe to drop the replication trackin origin after this
--typo: trackin -->tracking

4)
process_syncing_tables_for_sync:
if (MyLogicalRepWorker->slot_name && strcmp(syncslotname,
MyLogicalRepWorker->slot_name) != 0)
{
..............
ReplicationOriginNameForLogicalRep(MyLogicalRepWorker->subid,

MyLogicalRepWorker->relid,
originname,

sizeof(originname));

/* Drop replication origin */
replorigin_drop_by_name(originname, true, false);
}

--Are we passing missing_ok as true (second arg) intentionally here in
replorigin_drop_by_name? Once we fix the issue reported in my earlier
email (ASSERT), do you think it makes sense to pass missing_ok as
false here?

5)
process_syncing_tables_for_sync:
foreach(lc, rstates)
{

rstate = (SubscriptionRelState *)
palloc(sizeof(SubscriptionRelState));
memcpy(rstate, lfirst(lc),
sizeof(SubscriptionRelState));
/*
* Pick the table for the next run if it is
not already picked up
* by another worker.
*/
LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
if (rstate->state != SUBREL_STATE_SYNCDONE &&

!logicalrep_worker_find(MySubscription->oid, rstate->relid, false))

{
.........
}
LWLockRelease(LogicalRepWorkerLock);
}

--Do we need to palloc for each relation separately? Shall we do it
once outside the loop and reuse it? Also pfree is not done for rstate
here.

6)
LogicalRepSyncTableStart:
1385 slotname = (char *) palloc(NAMEDATALEN);
1413 prev_slotname = (char *) palloc(NAMEDATALEN);
1481 slotname = prev_slotname;
1502 pfree(prev_slotname);
1512 UpdateSubscriptionRel(MyLogicalRepWorker->subid,
1513
MyLogicalRepWorker->relid,
1514
MyLogicalRepWorker->relstate,
1515
MyLogicalRepWorker->relstate_lsn,
1516 slotname,
1517 originname);

Can you please review the above flow (I have given line# along with),
I think it could be problematic. We alloced prev_slotname, assigned it
to slotname, freed prev_slotname and used slotname after freeing the
prev_slotname.
Also slotname is allocated some memory too, that is not freed.

Reviewing further....

JFYI, I get below while applying patch:

========================
shveta(at)shveta-vm:~/repos/postgres1/postgres$ git am
~/Desktop/shared/reuse/v10-0002-Reuse-Logical-Replication-Background-worker.patch
Applying: Reuse Logical Replication Background worker
.git/rebase-apply/patch:142: trailing whitespace.
values[Anum_pg_subscription_rel_srrelslotname - 1] =
.git/rebase-apply/patch:692: indent with spaces.
errmsg("could not receive list of slots associated
with the subscription %u, error: %s",
.git/rebase-apply/patch:1055: trailing whitespace.
/*
.git/rebase-apply/patch:1057: trailing whitespace.
* relations.
.git/rebase-apply/patch:1059: trailing whitespace.
* and origin. Then stop the worker gracefully.
warning: 5 lines add whitespace errors.
========================

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2023-02-02 11:34:37 Re: run pgindent on a regular basis / scripted manner
Previous Message Amit Kapila 2023-02-02 11:21:14 Re: Deadlock between logrep apply worker and tablesync worker