| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Parallel Apply |
| Date: | 2026-04-22 12:05:17 |
| Message-ID: | CAJpy0uAd=ihFvav0-ipztgEuJrfVYg_ofXRpkH3Fo11Cxt_uTw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Apr 22, 2026 at 2:53 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Apr 21, 2026 at 4:49 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Monday, April 20, 2026 7:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Apr 17, 2026 at 12:57 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> > > wrote:
> > > >
> > > > 2)
> > > > Calling handle_dependency_on_change() from
> > > > handle_streamed_transaction() is misleading, since the former is
> > > > intended for non-streaming transactions, while the latter handles
> > > > streaming ones.
> > > >
> > >
> > > Can you first explain in which case, do we need to handle dependency
> > > for streamed transactions? IIUC, it is done in later patches, so we
> > > can move this part of code to later patches such that these should be
> > > able to handle stream transactions along with parallel-non-stream
> > > transactions.
> >
> > Yes, it's done in later patches, I moved the corresponding logic
> > to that patch in the updated version.
> >
> > >
> > > >
> > > > 5)
> > > > I started reading 002's commit message, I think it is not that clear.
> > > > I was trying to find if we have actual value for remote-xid which is
> > > > key to hash tbale. But I think it is hash-table for only xid as key
> > > > for faster access may be? If so, can we please improve commit messagee
> > > > little bit?
> > > >
> > >
> > > Right, and it is better to clarify if the transaction to wait is local
> > > or remote?
> >
> > Improved the commit message.
> >
> > >
> > >
> > > Few other comments:
> > > ===================
> > > 1.
> > > @@ -1916,7 +2015,106 @@ apply_handle_commit(StringInfo s)
> > > {
> > > ...
> > > + case TRANS_LEADER_PARTIAL_SERIALIZE:
> > > + Assert(winfo);
> > > +
> > > + /*
> > > + * Build a dependency with the last committed transaction if not
> > > + * already done.
> > > + */
> > > + if (apply_action != TRANS_LEADER_SEND_TO_PARALLEL)
> > > + build_dependency_with_last_committed_txn(winfo);
> > > +
> > > + stream_open_and_write_change(remote_xid,
> > > LOGICAL_REP_MSG_COMMIT,
> > > + &original_msg);
> > > +
> > > + pa_set_fileset_state(winfo->shared, FS_SERIALIZE_DONE);
> > > +
> > > + /* Finish processing the transaction. */
> > > + pa_xact_finish(winfo, commit_data.end_lsn);
> > >
> > > Can we move the serialize_to_file case handling as a separate patch,
> > > probably at the end, if possible? It will simplify the base patches
> > > and make them easier to review.
> >
> > Done as suggested.
> >
> > >
> > > 2.
> > > + /*
> > > + * The last remote transaction that modified the relation's schema or
> > > + * truncated the relation.
> > > + */
> > > + TransactionId last_depended_xid;
> > >
> > > It will be better to explain a bit on how it is used?
> >
> > Added.
> >
> > Here is updated patch set which addressed all comments so far.
> >
> > For 0001, I refactored the INTERNAL_MESSAGE handling to centralize the
> > processing of both internal and logical replication messages. We now add one bit
> > LOGICAL_REP_MSG_INTERNAL_MESSAGE to LogicalRepMsgType to indicate internal
> > messages. In apply_dispatch, the worker can then check the sub-internal
> > message type after reading LOGICAL_REP_MSG_INTERNAL_MESSAGE. This avoids
> > the maintenance burden of ensuring that sub-internal message types do not
> > conflict with LogicalRepMsgType values.
> >
> > Additionally, I've improved a few comments and the commit message based on
> > internal feedback from Peter Smith.
> >
>
>
> Thank You for the patch.
>
> Regarding 0001, I did not understand the need of having 2 separate messages:
>
> +#define PARALLEL_APPLY_INTERNAL_MESSAGE 'i'
> + LOGICAL_REP_MSG_INTERNAL_MESSAGE = 'i',
>
> And the need of sending both together in 0003:
>
> +send_internal_dependencies(ParallelApplyWorkerInfo *winfo, List
> *depends_on_xids)
> +{
> + pq_sendbyte(&dependencies, PARALLEL_APPLY_INTERNAL_MESSAGE);
> + pq_sendbyte(&dependencies, LOGICAL_REP_MSG_INTERNAL_MESSAGE);
>
>
> Also, it is confusing that above 2 are 'i' and
> WORKER_INTERNAL_MSG_RELATION is also 'i'. Code has become very tricky
> to understand now.
>
> Reviewing everything, I feel having 'i' outside of LogicalRepMsgType
> was better. I think it will eb better to retain
> PARALLEL_APPLY_INTERNAL_MESSAGE and getting rid of
> LOGICAL_REP_MSG_INTERNAL_MESSAGE. And when any worker intercepts
> PARALLEL_APPLY_INTERNAL_MESSAGE, it need not dispatch
> (apply_dispatch), instead it can handle it using
> apply_handle_internal_message()
>
> Goign above way:
> --Messaged received from pub can be handled using apply_dispatch.
> --Messages generated from leader to be handled separately/internally
> using apply_handle_internal_message().
>
> That way we have clear-cut boundary between the two types and less confusion.
>
> Also, can we use 'R' for WORKER_INTERNAL_MSG_RELATION similar to
> LOGICAL_REP_MSG_RELATION i.e. if 'i' is followed by 'R', then it means
> it is internal relation-msg instead of pub's one? 'R' seems a better
> choice over 'i' here.
>
> Thoughts?
>
> thanks
> Shveta
Few comments on v15-002:
v15-002:
1)
+/* An entry in the parallelized_txns shared hash table */
+typedef struct ParallelizedTxnEntry
+{
+ TransactionId xid; /* Hash key */
+} ParallelizedTxnEntry;
+
We can mention whether it is remote or local xid.
Same with below comment:
+/*
+ * A hash table used to track the parallelized transactions that could be
+ * depended on by other transactions.
+ */
2)
+/* parameters for the parallelized_txns shared hash table */
parameters --> Parameters
3)
+ else if (am_parallel_apply_worker())
+ {
+ /* Attach to existing dynamic shared hash table. */
+ parallel_apply_dsa_area =
dsa_attach(MyParallelShared->parallel_apply_dsa_handle);
+ dsa_pin_mapping(parallel_apply_dsa_area);
+ parallelized_txns = dshash_attach(parallel_apply_dsa_area, &dsh_params,
+ MyParallelShared->parallelized_txns_handle,
+ NULL);
+ }
Shall we have a sanity check to ensure
'MyParallelShared->parallel_apply_dsa_handle != DSA_HANDLE_INVALID' in
pa worker before invoking dsa_attach?
4)
pa_attach_parallelized_txn_hash() is done irrespective of txn type
(streaming/non-streaming), while handle_dependency_on_change() in 003
has this:
+ /* Compute dependency only for non-streaming transaction */
+ if (in_streamed_transaction || (winfo && winfo->stream_txn))
+ return;
I think both should be in sync in these initial patches. If we are
trying to setup parallel worker for non-streaming txn for the first
time, then we can initialize the shared-hash-table for dependency
tracking, else skip it. pa_launch_parallel_worker() can be changed to
accept 'stream_txn' argument which can then be used for this purpose.
thanks
Shveta
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2026-04-22 12:13:01 | Re: Get rid of translation strings that only contain punctuation |
| Previous Message | Matheus Alcantara | 2026-04-22 11:42:23 | Re: MERGE PARTITIONS and DEPENDS ON EXTENSION. |