Re: Perform streaming logical transactions by background workers and parallel apply

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-12 04:23:49
Message-ID: CAHut+PvA10Bp9Jaw9OS2+puKHr7ry_xB3Tf2-bbv5gyxD5E_gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, here are some review comments for patch v78-0001.

======

General

1. (terminology)

AFAIK everywhere until now we’ve been referring everywhere
(docs/comments/code) to the parent apply worker as the "leader apply
worker". Not the "main apply worker". Not the "apply leader worker".
Not any other variations...

From this POV I think the worker member "apply_leader_pid" would be
better named "leader_apply_pid", but I see that this was already
committed to HEAD differently.

Maybe it is not possible (or you don't want) to change that internal
member name but IMO at least all the new code and docs should try to
be using consistent terminology (e.g. leader_apply_XXX) where
possible.

======

Commit message

2.

main_worker_pid is Process ID of the leader apply worker, if this process is a
apply parallel worker. NULL if this process is a leader apply worker or a
synchronization worker.

IIUC, this text is just cut/paste from the monitoring.sgml. In a
review comment below I suggest some changes to that text, so then this
commit message should also change to be the same.

~~

3.

The new column can make it easier to distinguish leader apply worker and apply
parallel worker which is also similar to the 'leader_pid' column in
pg_stat_activity.

SUGGESTION
The new column makes it easier to distinguish parallel apply workers
from other kinds of workers. It is implemented this way to be similar
to the 'leader_pid' column in pg_stat_activity.

======

doc/src/sgml/logical-replication.sgml

4.

+ being synchronized. Moreover, if the streaming transaction is applied in
+ parallel, there will be additional workers.

SUGGESTION
there will be additional workers -> there may be additional parallel
apply workers

======

doc/src/sgml/monitoring.sgml

5. pg_stat_subscription

@@ -3198,11 +3198,22 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i

<row>
<entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>apply_leader_pid</structfield> <type>integer</type>
+ </para>
+ <para>
+ Process ID of the leader apply worker, if this process is a apply
+ parallel worker. NULL if this process is a leader apply worker or a
+ synchronization worker.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
<structfield>relid</structfield> <type>oid</type>
</para>
<para>
OID of the relation that the worker is synchronizing; null for the
- main apply worker
+ main apply worker and the parallel apply worker
</para></entry>
</row>

5a.

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

~~

5b.

The current text feels awkward. I see it was copied from the similar
text of 'pg_stat_activity' but perhaps it can be simplified a bit.

SUGGESTION
Process ID of the leader apply worker if this process is a parallel
apply worker; otherwise NULL.

~~

5c.
BEFORE
null for the main apply worker and the parallel apply worker

AFTER
null for the leader apply worker and parallel apply workers

~~

5c.

<structfield>relid</structfield> <type>oid</type>
</para>
<para>
OID of the relation that the worker is synchronizing; null for the
- main apply worker
+ main apply worker and the parallel apply worker
</para></entry>

main apply worker -> leader apply worker

~~~

6.

@@ -3212,7 +3223,7 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
</para>
<para>
Last write-ahead log location received, the initial value of
- this field being 0
+ this field being 0; null for the parallel apply worker
</para></entry>
</row>

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

~~~

7.

@@ -3221,7 +3232,8 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
<structfield>last_msg_send_time</structfield> <type>timestamp
with time zone</type>
</para>
<para>
- Send time of last message received from origin WAL sender
+ Send time of last message received from origin WAL sender; null for the
+ parallel apply worker
</para></entry>
</row>

(same as #6)

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

~~~

8.

@@ -3230,7 +3242,8 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
<structfield>last_msg_receipt_time</structfield>
<type>timestamp with time zone</type>
</para>
<para>
- Receipt time of last message received from origin WAL sender
+ Receipt time of last message received from origin WAL sender; null for
+ the parallel apply worker
</para></entry>
</row>

(same as #6)

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

~~~

9.

@@ -3239,7 +3252,8 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
<structfield>latest_end_lsn</structfield> <type>pg_lsn</type>
</para>
<para>
- Last write-ahead log location reported to origin WAL sender
+ Last write-ahead log location reported to origin WAL sender; null for
+ the parallel apply worker
</para></entry>
</row>

(same as #6)

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

~~~

10.

@@ -3249,7 +3263,7 @@ SELECT pid, wait_event_type, wait_event FROM
pg_stat_activity WHERE wait_event i
</para>
<para>
Time of last write-ahead log location reported to origin WAL
- sender
+ sender; null for the parallel apply worker
</para></entry>
</row>
</tbody>

(same as #6)

BEFORE
null for the parallel apply worker

AFTER
null for parallel apply workers

======

src/backend/catalog/system_views.sql

11.

@@ -949,6 +949,7 @@ CREATE VIEW pg_stat_subscription AS
su.oid AS subid,
su.subname,
st.pid,
+ st.apply_leader_pid,
st.relid,
st.received_lsn,
st.last_msg_send_time,

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

======

src/backend/replication/logical/launcher.c

12.

+ if (worker.apply_leader_pid == InvalidPid)
nulls[3] = true;
else
- values[3] = LSNGetDatum(worker.last_lsn);
- if (worker.last_send_time == 0)
+ values[3] = Int32GetDatum(worker.apply_leader_pid);
+

12a.

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

~~

12b.

I wondered if here the code should be using the
isParallelApplyWorker(worker) macro here for readability.

e.g.

if (isParallelApplyWorker(worker))
values[3] = Int32GetDatum(worker.apply_leader_pid);
else
nulls[3] = true;

======

src/include/catalog/pg_proc.dat

13.

+ proallargtypes =>
'{oid,oid,oid,int4,int4,pg_lsn,timestamptz,timestamptz,pg_lsn,timestamptz}',
+ proargmodes => '{i,o,o,o,o,o,o,o,o,o}',
+ proargnames =>
'{subid,subid,relid,pid,apply_leader_pid,received_lsn,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time}',

(Same as general comment #1 about terminology)

"apply_leader_pid" --> "leader_apply_pid"

======

src/test/regress/expected/rules.out

14.

@@ -2094,6 +2094,7 @@ pg_stat_ssl| SELECT s.pid,
pg_stat_subscription| SELECT su.oid AS subid,
su.subname,
st.pid,
+ st.apply_leader_pid,
st.relid,
st.received_lsn,
st.last_msg_send_time,
@@ -2101,7 +2102,7 @@ pg_stat_subscription| SELECT su.oid AS subid,
st.latest_end_lsn,
st.latest_end_time
FROM (pg_subscription su
- LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
pid, received_lsn, last_msg_send_time, last_msg_receipt_time,
latest_end_lsn, latest_end_time) ON ((st.subid = su.oid)));
+ LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
pid, apply_leader_pid, received_lsn, last_msg_send_time,
last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid
= su.oid)));
pg_stat_subscription_stats| SELECT ss.subid,
s.subname,
ss.apply_error_count,

(Same comment as elsewhere)

"apply_leader_pid" --> "leader_apply_pid"

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-01-12 04:39:49 Re: Cygwin cleanup
Previous Message Thomas Munro 2023-01-12 04:03:02 Re: Using WaitEventSet in the postmaster