Re: Perform streaming logical transactions by background workers and parallel apply

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-16 04:54:00
Message-ID: CAHut+PvFybTRJp24DdtcHF2QoG0OTfGsypa++drUW+yntPccjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for v81-0001.

======

Commit Message

1.

Additionally, update the leader_pid column in pg_stat_activity as well to
display the PID of the leader apply worker for parallel apply workers.

~

Probably it should not say both "Additionally" and "as well" in the
same sentence.

======

src/backend/replication/logical/launcher.c

2.

/*
+ * Return the pid of the leader apply worker if the given pid is the pid of a
+ * parallel apply worker, otherwise return InvalidPid.
+ */
+pid_t
+GetLeaderApplyWorkerPid(pid_t pid)
+{
+ int leader_pid = InvalidPid;
+ int i;
+
+ LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
+
+ for (i = 0; i < max_logical_replication_workers; i++)
+ {
+ LogicalRepWorker *w = &LogicalRepCtx->workers[i];
+
+ if (isParallelApplyWorker(w) && w->proc && pid == w->proc->pid)
+ {
+ leader_pid = w->leader_pid;
+ break;
+ }
+ }
+
+ LWLockRelease(LogicalRepWorkerLock);
+
+ return leader_pid;
+}

2a.
IIUC the IsParallelApplyWorker macro does nothing except check that
the leader_pid is not InvalidPid anyway, so AFAIK this algorithm does
not benefit from using this macro because we will want to return
InvalidPid anyway if the given pid matches.

So the inner condition can just say:

if (w->proc && w->proc->pid == pid)
{
leader_pid = w->leader_pid;
break;
}

~

2b.
A possible alternative comment.

BEFORE
Return the pid of the leader apply worker if the given pid is the pid
of a parallel apply worker, otherwise return InvalidPid.

AFTER
If the given pid has a leader apply worker then return the leader pid,
otherwise, return InvalidPid.

======

src/backend/utils/adt/pgstatfuncs.c

3.

@@ -434,6 +435,16 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
values[28] = Int32GetDatum(leader->pid);
nulls[28] = false;
}
+ else
+ {
+ int leader_pid = GetLeaderApplyWorkerPid(beentry->st_procpid);
+
+ if (leader_pid != InvalidPid)
+ {
+ values[28] = Int32GetDatum(leader_pid);
+ nulls[28] = false;
+ }
+

3a.
There is an existing comment preceding this if/else but it refers only
to leaders of parallel groups. Should that comment be updated to
mention the leader apply worker too?

~

3b.
It may be unrelated to this patch, but it seems strange to me that the
nulls[28]/values[28] assignments are done where they are. Every other
nulls/values assignment of this function here is pretty much in the
correct numerical order except this one, so IMO this code ought to be
relocated to later in this same function.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-01-16 05:01:57 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message David Rowley 2023-01-16 04:18:32 Re: Todo: Teach planner to evaluate multiple windows in the optimal order