Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-02-22 12:56:00
Message-ID: CAGPVpCQmEE8BygXr=Hi2N2t2kOE=PJwofn9TX0J9J4crjoXarQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Shveta,

Thanks for reviewing.
Please see attached patches.

shveta malik <shveta(dot)malik(at)gmail(dot)com>, 2 Şub 2023 Per, 14:31 tarihinde şunu
yazdı:

> On Wed, Feb 1, 2023 at 5:37 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
> for (int64 i = 1; i <= lastusedid; i++)
> {
> char originname_to_drop[NAMEDATALEN] = {0};
> snprintf(originname_to_drop,
> sizeof(originname_to_drop), "pg_%u_%lld", subid, (long long) i);
> .......
> }
>
> --Is it better to use the function
> 'ReplicationOriginNameForLogicalRep' here instead of sprintf, just to
> be consistent everywhere to construct origin-name?
>

ReplicationOriginNameForLogicalRep creates a slot name with current
"lastusedid" and doesn't accept that id as parameter. Here the patch needs
to check all possible id's.

> /* Drop replication origin */
> replorigin_drop_by_name(originname, true, false);
> }
>
> --Are we passing missing_ok as true (second arg) intentionally here in
> replorigin_drop_by_name? Once we fix the issue reported in my earlier
> email (ASSERT), do you think it makes sense to pass missing_ok as
> false here?
>

Yes, missing_ok is intentional. The user might be concurrently refreshing
the sub or the apply worker might already drop the origin at that point.
So, missing_ok is set to true.
This is also how origin drops before the worker exits are handled on HEAD
too. I only followed the same approach.

> --Do we need to palloc for each relation separately? Shall we do it
> once outside the loop and reuse it? Also pfree is not done for rstate
> here.
>

Removed palloc from the loop. No need to pfree here since the memory
context will be deleted with the next CommitTransactionCommand call.

> Can you please review the above flow (I have given line# along with),
> I think it could be problematic. We alloced prev_slotname, assigned it
> to slotname, freed prev_slotname and used slotname after freeing the
> prev_slotname.
> Also slotname is allocated some memory too, that is not freed.
>

Right, I used memcpy instead of assigning prev_slotname to slotname.
slotname is returned in the end and pfree'd later [1]

I also addressed your other reviews that I didn't explicitly mention in
this email. I simply applied the changes you pointed out. Also added some
more logs as well. I hope it's more useful now.

[1]
https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/worker.c#L4359

Thanks,
--
Melih Mutlu
Microsoft

Attachment Content-Type Size
v8-0001-Add-replication-protocol-cmd-to-create-a-snapsho.patch application/octet-stream 20.9 KB
v11-0002-Reuse-Logical-Replication-Background-worker.patch application/octet-stream 74.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2023-02-22 12:56:32 meson: Add equivalent of configure --disable-rpath option
Previous Message Melih Mutlu 2023-02-22 12:51:35 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication