| From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Subject: | RE: Parallel Apply |
| Date: | 2025-10-31 10:36:29 |
| Message-ID: | OS7PR01MB149681D3033D17FEBFF3A2D6EF5F8A@OS7PR01MB14968.jpnprd01.prod.outlook.com |
| Lists: | pgsql-hackers |
Dear hackers,
> TODO - potential improvement to use shared hash table for tracking
> dependencies.
I measured the performance of the shared hash table approach. Based on the results,
the local hash table approach seems better.
Abstract
========
The shared hash table brought no performance improvement; it showed a 1-2% regression.
The trend did not change with the number of parallel apply workers.
Machine details
===============
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz, 88 cores, 503 GiB RAM
Used patch
==========
0001 is the same as the patch Hou posted on -hackers [1], and 0002 is the patch for
the shared hash.
0002 introduces a shared hash table, dependency_dshash. Since the key of a shared
hash table must have a fixed length, it is computed from the replica identity of
tuples. When a parallel apply worker receives changes, it computes the hash key
again and remembers it in a list. At commit time it iterates over the list and
removes the hash entries for those keys.
0001 has a mechanism to clean up the local hash, but it was removed.
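To illustrate the key handling described above, here is a minimal Python sketch, not the patch's C code. The key length, the hashing function, the `leader_register` step, and all names are assumptions made for illustration; a plain dict stands in for dependency_dshash.

```python
import hashlib

KEY_LEN = 16  # assumed fixed key size in bytes; the real size is not stated in the mail


def make_key(relid: int, replica_identity: tuple) -> bytes:
    """Derive a fixed-length key from a table id and its replica identity values."""
    h = hashlib.blake2b(digest_size=KEY_LEN)
    h.update(relid.to_bytes(4, "big"))
    for value in replica_identity:
        data = repr(value).encode()
        h.update(len(data).to_bytes(4, "big"))  # length prefix keeps fields unambiguous
        h.update(data)
    return h.digest()


def leader_register(shared_table: dict, relid: int, replica_identity: tuple, xid: int) -> None:
    """Assumed insertion side: record which transaction last touched the row."""
    shared_table[make_key(relid, replica_identity)] = xid


class ParallelApplyWorker:
    """Hypothetical stand-in for a parallel apply worker."""

    def __init__(self, shared_table: dict, xid: int):
        self.shared_table = shared_table  # stand-in for the shared hash table
        self.xid = xid
        self.remembered_keys = []  # keys to clean up at commit

    def receive_change(self, relid: int, replica_identity: tuple) -> None:
        # Compute the key again on the worker side and remember it in a list.
        self.remembered_keys.append(make_key(relid, replica_identity))

    def commit(self) -> None:
        # Iterate over the list and remove the matching hash entries.
        for key in self.remembered_keys:
            self.shared_table.pop(key, None)
        self.remembered_keys.clear()


table = {}
leader_register(table, 16384, (1,), xid=700)
worker = ParallelApplyWorker(table, xid=700)
worker.receive_change(16384, (1,))
worker.commit()
```

The point of the sketch is the extra work per change: the key is hashed once when the entry is created and again in the worker, and every entry needs an explicit removal at commit.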
Workload
========
Setup:
---------
Pub --> Sub
- Two nodes were created in a pub-sub synchronous logical replication setup.
- Both nodes have the same set of pgbench tables, created with scale=100.
- The Sub node is subscribed to all changes from the Pub's pgbench tables.
Workload Run:
--------------------
- Run the built-in pgbench script (simple-update) [2] only on Pub, with #clients=40 and a run duration of 5 minutes.
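For readers who want to reproduce the workload, the pgbench invocations can be sketched as follows. This is a rough sketch that only builds the command lines; the database name `postgres` is an assumption, and any other options used in the actual run (threads, host, port) are not stated above and are left out.

```python
import shlex


def pgbench_init_cmd(dbname: str, scale: int = 100) -> list:
    """Command that creates the pgbench tables at the given scale factor."""
    return ["pgbench", "-i", "-s", str(scale), dbname]


def pgbench_run_cmd(dbname: str, clients: int = 40, duration_s: int = 300) -> list:
    """Command that runs the built-in simple-update script for a fixed duration."""
    return ["pgbench", "-b", "simple-update",
            "-c", str(clients), "-T", str(duration_s), dbname]


# "postgres" is an assumed database name, used here only for illustration.
print(shlex.join(pgbench_init_cmd("postgres")))
print(shlex.join(pgbench_run_cmd("postgres")))
```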
Results:
--------------------
The number of parallel apply workers was set to 4, 8, or 16. In all cases 0001 performed better.
The diff column is the TPS drop of 0001+0002 relative to 0001.
#worker = 4
-----------
        0001            0001+0002       diff
TPS     14499.33387     14097.74469     3%
        14361.7166      14359.87781     0%
        14467.91344     14153.53934     2%
        14451.8596      14381.70987     0%
        14646.90346     14239.4712      3%
        14530.66788     14298.33845     2%
        14733.35987     14189.41794     4%
        14543.9252      14373.21266     1%
        14945.57568     14249.46787     5%
        14638.6342      14125.87626     4%
AVE     14581.988979    14246.865608    2%
MEDIAN  14537.296540    14244.469536    2%
#worker = 8
-----------
        0001            0001+0002       diff
TPS     21531.08712     21443.68765     0%
        22337.60439     21383.94778     4%
        21806.70504     21097.42874     3%
        22192.99695     21424.78921     4%
        21721.95472     21470.8714      1%
        21450.6779      21265.89539     1%
        21397.51433     21606.51486     -1%
        21551.09391     21306.97061     1%
        21455.89699     21351.38868     0%
        21849.52528     21304.42329     3%
AVE     21729.505662    21365.591761    2%
MEDIAN  21636.524316    21367.668229    1%
#worker = 16
------------
        0001            0001+0002       diff
TPS     28034.64652     28129.85068     0%
        27839.10942     27364.40725     2%
        27693.94576     27871.80199     -1%
        27717.83971     27129.96132     2%
        28453.25381     27439.77526     4%
        28083.73208     27201.0004      3%
        27842.19262     27226.43813     2%
        27729.44205     27459.01256     1%
        28103.76727     27385.80016     3%
        27688.52482     27485.67209     1%
AVE     27918.645405    27469.371982    2%
MEDIAN  27840.651020    27412.787708    2%
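The AVE, MEDIAN, and diff values can be rechecked with a few lines of Python. The sketch below uses the 4-worker TPS values copied from the results above; the helper name `regression_pct` is only illustrative, and the diff is assumed to be (0001 - 0001+0002) / 0001, which matches the posted percentages.

```python
from statistics import mean, median

# TPS values copied from the 4-worker results above.
tps_0001 = [14499.33387, 14361.7166, 14467.91344, 14451.8596, 14646.90346,
            14530.66788, 14733.35987, 14543.9252, 14945.57568, 14638.6342]
tps_0002 = [14097.74469, 14359.87781, 14153.53934, 14381.70987, 14239.4712,
            14298.33845, 14189.41794, 14373.21266, 14249.46787, 14125.87626]


def regression_pct(base: float, patched: float) -> float:
    """Relative TPS drop of the patched build versus the base, in percent."""
    return (base - patched) / base * 100.0


for label, stat in (("AVE", mean), ("MEDIAN", median)):
    base, patched = stat(tps_0001), stat(tps_0002)
    print(f"{label:7s}{base:15.6f}{patched:15.6f}{regression_pct(base, patched):6.1f}%")
```

The same lists can be swapped for the 8-worker and 16-worker runs to check the other two tables.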
[1]: https://www.postgresql.org/message-id/OS0PR01MB5716D43CB68DB8FFE73BF65D942AA%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: https://www.postgresql.org/docs/current/pgbench.html#PGBENCH-OPTION-BUILTIN
Best regards,
Hayato Kuroda
FUJITSU LIMITED
| Attachment | Content-Type | Size |
|---|---|---|
| v20251031-0001-Parallel-apply-non-streaming-transactions.patch | application/octet-stream | 73.1 KB |
| v20251031-0002-WIP-convert-the-hash-table-into-shared-one.patch | application/octet-stream | 26.8 KB |