Re: Adding a LogicalRepWorker type field

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding a LogicalRepWorker type field
Date: 2023-07-31 09:55:04
Message-ID: CALj2ACUCJcmnHj689oC5JyNRzEQUe53xbxuU_CQ4NmVdiazdNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 31, 2023 at 7:17 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> PROBLEM:
>
> IMO, deducing the worker's type by examining multiple different field
> values seems a dubious way to do it. This maybe was reasonable enough
> when there were only 2 types, but as more get added it becomes
> increasingly complicated.

+1 for being more explicit in worker types. I also think that we must
move away from am_{parallel_apply, tablesync,
leader_apply}_worker(void) to Is{ParallelApply, TableSync,
LeaderApply}Worker(MyLogicalRepWorker), just to be a little more
consistent and less confusion around different logical replication
worker type related functions.

> AFAIK none of the complications is necessary anyway; the worker type
> is already known at launch time, and it never changes during the life
> of the process.
>
> The attached Patch 0001 introduces a new enum 'type' field, which is
> assigned during the launch.
>
> This change not only simplifies the code, but it also permits other
> code optimizations, because now we can switch on the worker enum type,
> instead of using cascading if/else. (see Patch 0002).

Some comments:
1.
+/* Different kinds of workers */
+typedef enum LogicalRepWorkerType
+{
+ LR_WORKER_TABLESYNC,
+ LR_WORKER_APPLY,
+ LR_WORKER_APPLY_PARALLEL
+} LogicalRepWorkerType;

Can these names be readable? How about something like
LR_TABLESYNC_WORKER, LR_APPLY_WORKER, LR_PARALLEL_APPLY_WORKER?

2.
-#define isParallelApplyWorker(worker) ((worker)->leader_pid != InvalidPid)
+#define isLeaderApplyWorker(worker) ((worker)->type == LR_WORKER_APPLY)
+#define isParallelApplyWorker(worker) ((worker)->type ==
LR_WORKER_APPLY_PARALLEL)
+#define isTablesyncWorker(worker) ((worker)->type == LR_WORKER_TABLESYNC)

Can the above start with "Is" instead of "is" similar to
IsLogicalWorker and IsLogicalParallelApplyWorker?

3.
+ worker->type =
+ OidIsValid(relid) ? LR_WORKER_TABLESYNC :
+ is_parallel_apply_worker ? LR_WORKER_APPLY_PARALLEL :
+ LR_WORKER_APPLY;

Perhaps, an if-else is better for readability?

if (OidIsValid(relid))
worker->type = LR_WORKER_TABLESYNC;
else if (is_parallel_apply_worker)
worker->type = LR_WORKER_APPLY_PARALLEL;
else
worker->type = LR_WORKER_APPLY;

4.
+/* Different kinds of workers */
+typedef enum LogicalRepWorkerType
+{
+ LR_WORKER_TABLESYNC,
+ LR_WORKER_APPLY,
+ LR_WORKER_APPLY_PARALLEL
+} LogicalRepWorkerType;

Have a LR_WORKER_UNKNOWN = 0 and set it in logicalrep_worker_cleanup()?

5.
+ case LR_WORKER_APPLY:
+ return (rel->state == SUBREL_STATE_READY ||
+ (rel->state == SUBREL_STATE_SYNCDONE &&
+ rel->statelsn <= remote_final_lsn));

Not necessary, but a good idea to have a default: clause and error out
saying wrong logical replication worker type?

6.
+ case LR_WORKER_APPLY_PARALLEL:
+ /*
+ * Skip for parallel apply workers because they only
operate on tables
+ * that are in a READY state. See pa_can_start() and
+ * should_apply_changes_for_rel().
+ */
+ break;

I'd better keep this if-else as-is instead of a switch case with
nothing for parallel apply worker.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2023-07-31 10:05:02 Missing comments/docs about custom scan path
Previous Message John Naylor 2023-07-31 09:39:04 Re: Performance degradation on concurrent COPY into a single relation in PG16.