Re: Parallel Apply

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel Apply
Date: 2026-04-23 09:25:06
Message-ID: CAFiTN-twfExcQzVs3jhMBpP=VC1jv7J4+OqX8A9LCuJrTCoNcg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 16, 2026 at 10:29 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, April 17, 2026 12:05 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Tuesday, April 14, 2026 9:00 PM Kuroda, Hayato/黒田 隼人
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > Other comments were addressed accordingly, please see attached patch set.
> >
> > I started reviewing patches 0001-0004 myself, aiming to add comments where
> > the design is not straightforward and to identify and fix any clearly incorrect
> > behavior.
> >
> > Here is the updated patch set with the following improvements:
> >
> > * Cosmetic changes in 0001-0004
> > * Additional comments in 0001-0004
> > * Code simplification by merging unnecessary static functions
> > * Removal of function exports left over from the POC version that are no
> > longer needed
> > * Got rid of XLogRecPtrIsInvalid()
> > * Fixed buggy behavior in partial serialization mode, including:
> > 1) The leader did not serialize the dependency on the last committed
> > transaction
> > 2) The parallel apply worker could not identify internal messages in
> > spooled changes
> > 3) An assertion failure in maybe_start_skipping_changes()
> > * Added one test for serialization and restore non-streaming transactions in
> > 0004.
> >
> > Thanks to Kuroda-San for discussing these changes internally with me.

I have started reviewing the design and patches. A couple of questions/suggestions:

0001:
1. Looking at the commit message and patch, the motivation for
WORKER_INTERNAL_MSG_RELATION isn't very clear to me. It's clear what
it does, but not why it is needed.

2.
+/*
+ * Wait for the given transaction to finish.
+ */
+void
+pa_wait_for_depended_transaction(TransactionId xid)
+{
+ elog(DEBUG1, "wait for depended xid %u", xid);
+
+ for (;;)
+ {
+ /* XXX wait until given transaction is finished */
+ }
+
+ elog(DEBUG1, "finish waiting for depended xid %u", xid);
+}

Does that mean the waiting logic isn't implemented yet?
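
For illustration, a wait loop of this kind generally needs an exit
condition and some way to yield between checks; the quoted hunk has
neither. The following is a minimal standalone sketch of that shape, not
PostgreSQL code. The names sketch_wait_for_xid, sketch_finished_fn and
sketch_countdown_finished are hypothetical, and the nanosleep() call only
stands in for whatever the patch ends up using (in the server this would
presumably be a latch wait plus an interrupt check).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate: returns true once the transaction has finished. */
typedef bool (*sketch_finished_fn) (uint32_t xid, void *arg);

/*
 * Poll until the predicate reports that xid has finished, sleeping briefly
 * between checks.  Returns the number of checks made, or -1 if max_polls
 * checks all came back false.
 */
static int
sketch_wait_for_xid(uint32_t xid, sketch_finished_fn finished, void *arg,
                    int max_polls)
{
    struct timespec nap = {0, 1000000L};    /* 1 ms between checks */

    assert(finished != NULL);

    for (int polls = 1; polls <= max_polls; polls++)
    {
        if (finished(xid, arg))
            return polls;       /* exit condition: transaction finished */
        nanosleep(&nap, NULL);  /* yield instead of spinning */
    }
    return -1;
}

/* Demonstration predicate: reports "finished" after *arg more checks. */
static bool
sketch_countdown_finished(uint32_t xid, void *arg)
{
    int        *remaining = arg;

    (void) xid;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

The max_polls bound exists only to keep the sketch testable; the real
function might well wait indefinitely.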

3.
+ if (c == PqReplMsg_WALData)
+ {
+ /*
+ * Ignore statistics fields that have been updated by the
+ * leader apply worker.
+ *
+ * XXX We can avoid sending the statistics fields from the
+ * leader apply worker but for that, it needs to rebuild the
+ * entire message by removing these fields which could be more
+ * work than simply ignoring these fields in the parallel apply
+ * worker.
+ */
+ s.cursor += SIZE_STATS_MESSAGE;

- apply_dispatch(&s);
+ apply_dispatch(&s);
+ }

I could not understand how this change is relevant to patch 0001. This
patch implements two internal messages; why is ignoring statistics fields
for non-internal messages relevant here?
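
To make the mechanics of the hunk concrete, here is a minimal standalone
sketch of advancing a read cursor past fixed-size leading fields before
dispatching on the payload. It is not the patch's code. It assumes the
message starts with a one-byte type 'w' followed by three 8-byte fields
(start LSN, end LSN, send time), which is how I read the WAL data message
layout; SketchBuf, SKETCH_STATS_SIZE and sketch_next_action are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the leading fields: start LSN, end LSN, send time. */
#define SKETCH_STATS_SIZE (2 * sizeof(uint64_t) + sizeof(int64_t))

/* Hypothetical stand-in for a StringInfo-style read buffer. */
typedef struct SketchBuf
{
    const char *data;
    size_t      len;
    size_t      cursor;
} SketchBuf;

/*
 * Consume the one-byte message type.  For a WAL data message (assumed to
 * be 'w'), step over the statistics fields and return the first payload
 * byte, leaving the cursor on it.  Returns -1 for any other message type
 * or for a truncated message.
 */
static int
sketch_next_action(SketchBuf *buf)
{
    assert(buf != NULL);

    if (buf->cursor >= buf->len)
        return -1;

    char        c = buf->data[buf->cursor++];

    if (c != 'w')
        return -1;

    /* Need the statistics fields plus at least one payload byte. */
    if (buf->len - buf->cursor < SKETCH_STATS_SIZE + 1)
        return -1;

    buf->cursor += SKETCH_STATS_SIZE;   /* ignore, do not parse */
    return (unsigned char) buf->data[buf->cursor];
}
```

Skipping by a constant offset avoids rebuilding the message on the sending
side, which matches the trade-off described in the XXX comment.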

--
Regards,
Dilip Kumar
Google
