RE: Parallel Apply

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Parallel Apply
Date: 2026-06-04 06:58:31
Message-ID: TY4PR01MB17718A4AB7F564C8DB0D6F7DC94102@TY4PR01MB17718.jpnprd01.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, May 13, 2026 5:32 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, May 13, 2026 at 10:54 AM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> wrote:
> >
> >
> > Please find a few comments for patch003 mostly:

Thanks for the comments!

> >
> > 1)
> > * depended on by other transactions. Entries are of type
> ParallelizedTxnEntry.
> > *
> > * dshash is used to enable dynamic shared memory allocation based on
> > the number
> > - * of transactions being applied in parallel.
> > + * of transactions being applied in parallel. Entries are of type
> > ParallelizedTxnEntry.
> > */
> > static dsa_area *parallel_apply_dsa_area = NULL; static dshash_table
> > *parallelized_txns = NULL;
> >
> > 'Entries are of type ParallelizedTxnEntry' repeated twice in this comment.

Removed.

> >
> >
> > 2)
> > cleanup_committed_replica_identity_entries:
> >
> > + if (!skipped_write && !XLogRecPtrIsValid(pos->local_end))
> > + continue;
> >
> > Perhaps a comment will help to indicate above checks means a txn not
> > yet finished.

Added.

> >
> > 3)
> > Can you please clarify the scope, life-span of entries in
> > parallelized_txns vs ParallelApplyTxnHash. Both have remote-xid field.
> > So at any point of time, do both tables will have same number of
> > entries or if entries in one has bigger life-span/scope as compared to
> > other? It will be good to briefly mention these atop the hash-tables.

Added comments for it atop of these hash tables.

> >
> > 4)
> > +/*
> > + * Hash table storing replica identity values for changes being
> > +applied in
> > + * parallel, along with the last transaction that modified each row.
> > ...
> > +static replica_identity_hash *replica_identity_table = NULL;
> >
> > Regarding 'along with the last transaction that modified each row', is
> > 'remote_xid' in ReplicaIdentityEntry is the last transaction that
> > modified this row or the one which is currently modifying it?

It could be either, depending on whether the transaction is still running or
has already finished. In this version, I've changed it to be the "latest"
transaction.

> >
> > 5)
> > Since we have added comments for rest for the fields for below
> > existing structure, do you think we can update comment for 'xid' as
> > well to say it is remote-one. It does not mention it anywhere in
> > comment.
> >
> > typedef struct ParallelApplyWorkerEntry { TransactionId xid; /* Hash
> > key -- must be first */
> >

Added.

> >
> > 6)
> > 003' commit message says about RI table entry:
> >
> > Entries are deleted when transactions committed by parallel workers
> > are gathered, or the number of entries exceeds the limit.
> > --
> > I don't understand what do we mean by "when transactions committed by
> > parallel workers are gathered". Can we please make it more
> > clear/elaborate.
> >

I modified this to be consistent with some detailed comments in the patch.

> > ~~
> >
> > Reviewing further.
> >
>
> Few more on 003:
>
>
> 7)
> I find check_and_record_ri_dependency() difficult to understand.
>
> a)
> As an example, this part:
>
> + /*
> + * Return if no entry exists, or if the current transaction was the
> + last one
> + * to modify the key.
> + */
> + if (!rientry || TransactionIdEquals(rientry->remote_xid,
> + new_depended_xid)) return;
>
> IMO, the second check makes sense if new_depended_xid is valid. If so, why
> don't we make it part of previous 'if
> (TransactionIdIsValid(new_depended_xid))' logic. If 'found' was true, we can
> check it inside that if-block and 'return' from there instead of proceeding
> further.
>
> Once we make above change, we can even move below logic inside previous
> 'if (TransactionIdIsValid(new_depended_xid))' block, as it looks strange that in
> previous if-block we are assigning InvalidTransactionId to 'rientry-
> >remote_xid' while we have valid new_depended_xid available there.
>
> + /*
> + * Update the new depended xid into the entry if valid, the new xid
> + could
> + * be invalid if the transaction will be applied by the leader itself
> + * which means all the changes will be committed before processing next
> + * transaction, so no need to be depended on.
> + */
> + if (TransactionIdIsValid(new_depended_xid))
> + rientry->remote_xid = new_depended_xid;

Right. The original code was written that way to save a few lines,
but I agree it's harder to understand. I've refactored this part to
make each section clearer.

>
>
> b)
> Also this part is not clear:
>
> + /*
> + * Return if RI key is NULL or is explicitly marked unchanged. The key
> + * value could be NULL in the new tuple of a update opertaion which
> + * means the RI key is not updated.
> + */
> + if (original_data->colstatus[i] == LOGICALREP_COLUMN_NULL ||
> + original_data->colstatus[i] == LOGICALREP_COLUMN_UNCHANGED)
> return;
>
> Why do have we 'return' here when one of the columns is NULL or
> unchanged? What happens to rest of the RI columns? Which scenario may hit
> this? It needs more comments to explain the scenario.

After testing, I found that the handling for unchanged toasted columns was
incorrect. When the RI key includes two columns and only the non-toasted column
is changed, the toasted column is not included in the new tuple of an UPDATE
change. This means we cannot obtain the complete RI key value to check for
dependencies when handling the new change. I fixed this by copying the toasted
value from the old tuple to the new one before dependency checking.

>
>
> 8)
> check_and_record_ri_dependency() has this comment and logic around
> invalid remote_xid:
>
> + /*
> + * Remove the entry if the transaction has been committed and no new
> + * dependency needs to be added.
> + */
> + else if (!TransactionIdIsValid(rientry->remote_xid))
> + {
> + free_replica_identity_key(rientry->keydata);
> + replica_identity_delete_item(replica_identity_table, rientry); }
>
> While find_all_dependencies_on_rel() has this assert:
> + Assert(TransactionIdIsValid(rientry->remote_xid));
>
> The first logic says that we may have Invalid remote_xid in existing entry in
> replica_identity_table while second logic has a sanity check while iterating
> the same hash-table that all entries must have valid remote_xid. Is the Assert
> correct? We might have Invalid remote_xid if txn is committed (done in
> check_and_append_xid_dependency).

It's correct. The remote_xid could be invalid only before being stored
in the table. The check_and_append_xid_dependency function wasn't very clear
about this, but after refactoring, I hope it's easier to understand now.

Here is the V19 patch set which addressed all the comments above.

Best Regards,
Hou zj

Attachment Content-Type Size
v19-0001-Introduce-a-shared-hash-table-to-store-paralleli.patch application/octet-stream 9.2 KB
v19-0010-Support-serializing-changes-to-disk-when-the-sen.patch application/octet-stream 16.3 KB
v19-0009-Support-dependency-tracking-via-foreign-keys.patch application/octet-stream 17.0 KB
v19-0008-Support-dependency-tracking-via-local-unique-ind.patch application/octet-stream 24.0 KB
v19-0006-Track-dependencies-for-streamed-transactions.patch application/octet-stream 10.1 KB
v19-0007-Wait-applying-transaction-if-one-of-user-defined.patch application/octet-stream 11.9 KB
v19-0005-support-2PC.patch application/octet-stream 13.4 KB
v19-0004-Parallel-apply-non-streaming-transactions.patch application/octet-stream 61.6 KB
v19-0002-Introduce-internal-messages-to-track-dependencie.patch application/octet-stream 12.7 KB
v19-0003-Introduce-a-local-hash-table-to-store-replica-id.patch application/octet-stream 34.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Enrique Sánchez 2026-06-04 07:00:23 Re: Extended statistics improvement: multi-column MCV missing values
Previous Message shveta malik 2026-06-04 06:58:24 Re: Proposal: Conflict log history table for Logical Replication