| From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(at)vondra(dot)me>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Subject: | RE: Parallel Apply |
| Date: | 2026-04-30 14:39:24 |
| Message-ID: | TYRPR01MB14195F002941AF23A1350F00D94352@TYRPR01MB14195.jpnprd01.prod.outlook.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wednesday, April 29, 2026 2:51 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Apr 27, 2026 at 4:10 PM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > Dear hackers,
> >
> > Here is new patch set. The Option 2 was chosen for the format of
> > internal messages based on the discussion.
> >
>
> Thanks. Please find a few comments:
Thanks for the comments.
> 5)
> Now there are 3 types of hash-tables:
>
> HTAB used for existing ParallelApplyTxnHash dynamic-hash for
> parallelized_txns simplehash for replica_identity_table
>
> a)
> What was the design rationale behind choosing different hash table
> implementations for these different uses? Is there a comment, prior discussion,
> or email thread where this reasoning has been explained?
For parallelized_txns, the hash table must reside in shared memory and needs to
grow dynamically based on the number of transactions. dshash is the only natural choice
here: simplehash does not seem to be designed for shared memory, and HTAB (which
does support shared memory) cannot expand its size on the fly.
For replica_identity_table, we need maximum efficiency since every change
requires a hash operation. simplehash is the better choice here. See the
comments atop simplehash.h for details.
For ParallelApplyTxnHash (used per large transaction with a simple XID key),
we prioritize ease of setup, so HTAB is sufficient.
I've added brief comments for new hash tables in the patch set.
>
> b)
> Regarding names, was it a conscious decision to name parallelized_txns and
> replica_identity_table in snake_case case different from existing
> 'ParallelApplyTxnHash'? I know other code using DSA often follows snake_case,
> but having ParallelApplyTxnHash and parallelized_txns defined next to each
> other feels a bit inconsistent. Also, I wonder whether hash table objects should
> perhaps be named more like ParallelApplyTxnHash, so they stand out while
> reading the code; otherwise they mostly look like ordinary local variables,
> which they are not.
I don't have a strong opinion on the naming of the hash tables. I'll wait to
see if others have suggestions, as renaming is easy to address once the
patches become stable.
>
> 6)
> cleanup_committed_replica_identity_entries:
>
> + if (!skipped_write && !XLogRecPtrIsValid(pos->local_end))
> + continue;
>
> We can use XLogRecPtrIsInvalid instead.
XLogRecPtrIsInvalid() is a deprecated macro, so we should avoid using it.
Other comments have been addressed in the latest patch set.
Best Regards,
Hou zj
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Zhijie Hou (Fujitsu) | 2026-04-30 14:40:17 | RE: Parallel Apply |
| Previous Message | Zhijie Hou (Fujitsu) | 2026-04-30 14:38:45 | RE: Parallel Apply |