| From: | 陈宗志 <baotiao(at)gmail(dot)com> |
|---|---|
| To: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
| Cc: | Robert Treat <rob(at)xzilla(dot)net>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [PROPOSAL] Doublewrite Buffer as an alternative torn page protection to Full Page Write |
| Date: | 2026-02-27 11:43:34 |
| Message-ID: | CAGbZs7j2x4Ld7OztM1wrdXD9-PLwYpQXiw1SfKaws85nnHewcA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi wenhui,
Here are the latest benchmark results for the Double Write Buffer (DWB)
proposal. In this round of testing, I have included the two-phase
checkpoint batch fsync optimization and evaluated the impact of
wal_compression (lz4) on both FPW and DWB.
Test Environment:
- PostgreSQL: 19devel (with DWB patch applied)
- Hardware: Linux 5.10, x86_64
- Configuration:
* shared_buffers = 1GB
* max_wal_size = 32MB (to stress checkpoint frequency)
* wal_compression = lz4
* double_write_buffer_size = 128MB (for DWB mode)
- Workload: sysbench 1.1.0, 10 tables x 1,000,000 rows (~2.3GB dataset)
- Method: 16 threads, 60 seconds per run, each mode tested
independently (only one instance running at a time to eliminate
I/O contention).
Three modes compared:
- FPW: io_torn_pages_protection = full_pages (current default)
- DWB: io_torn_pages_protection = double_writes
- OFF: io_torn_pages_protection = off (no protection, baseline)
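To make the cost model behind the three modes concrete, here is a toy sketch (not the patch's actual code; the function and its shape are illustrative) of what each setting implies for the first modification of a page after a checkpoint: full_pages appends an 8KB full-page image to WAL, double_writes stages the page in the DWB file instead, and off writes only the ordinary record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192          /* PostgreSQL page size */

typedef enum { TP_FULL_PAGES, TP_DOUBLE_WRITES, TP_OFF } TornPageMode;

/*
 * Toy model: bytes appended to WAL for one page's first modification
 * after a checkpoint, and whether the page must also pass through the
 * DWB file.  'record_len' is the size of the ordinary WAL record.
 */
static size_t
wal_bytes_for_update(TornPageMode mode, size_t record_len, bool *needs_dwb)
{
    *needs_dwb = false;
    switch (mode)
    {
        case TP_FULL_PAGES:
            return record_len + BLCKSZ;   /* record + 8KB full-page image */
        case TP_DOUBLE_WRITES:
            *needs_dwb = true;            /* page staged in the DWB file */
            return record_len;            /* WAL carries only the record */
        case TP_OFF:
            return record_len;            /* no torn-page protection */
    }
    return 0;                             /* not reached */
}
```

This is also why lz4 matters so much more for FPW than for DWB below: the 8KB images are what gets compressed, and double_writes moves them out of WAL entirely.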
Results with wal_compression = lz4
----------------------------------
1. oltp_write_only (pure write transactions: UPDATE + DELETE + INSERT)
Mode TPS vs FPW vs OFF
---- ------ ------ ------
FPW 13,772 - -64.3%
DWB 20,660 +50.0% -46.5%
OFF 38,588 +180.2% -
2. oltp_update_non_index (single UPDATE per transaction)
Mode TPS vs FPW vs OFF
---- ------ ------ ------
FPW 59,427 - -57.5%
DWB 104,328 +75.6% -25.4%
OFF 139,870 +135.4% -
3. oltp_read_write (mixed: 70% reads + 30% writes)
Mode TPS vs FPW vs OFF
---- ------ ------ ------
FPW 6,232 - -9.0%
DWB 4,408 -29.3% -35.6%
OFF 6,845 +9.8% -
Results without wal_compression (for comparison)
------------------------------------------------
Workload FPW DWB DWB vs FPW
-------- ------ ------ ----------
oltp_write_only 9,651 22,111 +129.1%
oltp_update_non_index 48,624 98,356 +102.3%
oltp_read_write 5,414 5,275 -2.6%
Key Observations:
1. Write-heavy workloads: DWB outperforms FPW by +50% to +76% even
with lz4 compression enabled. Without lz4, the advantage grows
to +102% to +129% because uncompressed full-page images cause
severe WAL bloat.
2. lz4 compression significantly helps FPW: for oltp_write_only, lz4
boosts FPW from 9,651 to 13,772 TPS (+43%), while DWB sees only a
slight decrease (22,111 -> 20,660, -6.6%). This is expected -- lz4
compresses the 8KB full-page images that FPW writes to WAL, but DWB
generates no FPIs at all, so lz4 has little effect on DWB's WAL volume.
3. Read-heavy mixed workloads: DWB shows a regression (-29%) in
oltp_read_write with lz4. This workload is 70% reads with only 4
write operations per transaction, so FPW overhead is minimal.
Meanwhile, DWB incurs additional I/O overhead from writing pages
to the double write buffer file, which outweighs the WAL savings
in this scenario.
4. Batch fsync optimization is critical for DWB: The two-phase
checkpoint approach (batch all DWB writes in Phase 1 -> single
fsync -> data file writes in Phase 2) reduces checkpoint DWB
fsyncs from millions to ~hundreds. For example, in
oltp_write_only: 1,157,729 DWB page writes -> only 148 fsyncs.
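The two-phase flow can be sketched as a standalone simulation (file names and structure here are illustrative, not the patch's actual layout): Phase 1 batches every dirty page into one DWB file and issues a single fsync, after which each page has a durable copy that recovery can use to repair a torn write; Phase 2 then writes the pages to their home locations.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLCKSZ  8192
#define NPAGES  64            /* dirty pages in this toy checkpoint */

/*
 * Simulate one two-phase checkpoint and return the number of fsync()
 * calls issued (or -1 on I/O error).
 */
static int
two_phase_checkpoint(void)
{
    static char page[BLCKSZ];
    int nfsync = 0;
    int dwb = open("toy_dwb_file", O_CREAT | O_TRUNC | O_WRONLY, 0600);
    int dat = open("toy_data_file", O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (dwb < 0 || dat < 0)
        return -1;

    /* Phase 1: batch all pages into the DWB file, then ONE fsync.
     * Once it returns, every page has a durable copy, so a torn write
     * during Phase 2 is recoverable from the DWB file. */
    for (int i = 0; i < NPAGES; i++) {
        memset(page, i & 0xff, sizeof(page));
        if (write(dwb, page, BLCKSZ) != BLCKSZ) return -1;
    }
    if (fsync(dwb) != 0) return -1;
    nfsync++;

    /* Phase 2: write pages to their home data file, one final fsync. */
    for (int i = 0; i < NPAGES; i++) {
        memset(page, i & 0xff, sizeof(page));
        if (write(dat, page, BLCKSZ) != BLCKSZ) return -1;
    }
    if (fsync(dat) != 0) return -1;
    nfsync++;

    close(dwb);
    close(dat);
    unlink("toy_dwb_file");
    unlink("toy_data_file");

    /* Naive per-page durability would cost NPAGES + 1 fsyncs here. */
    return nfsync;
}
```

The point of the batching is visible in the return value: 2 fsyncs regardless of NPAGES, which is how the checkpoint above gets from 1,157,729 page writes down to 148 fsyncs instead of one fsync per write.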
Summary:
DWB provides substantial performance benefits for write-intensive
workloads with frequent checkpoints, which is the scenario where FPW
overhead is most pronounced. The advantage is most significant without
WAL compression (+100~130%), and remains strong (+50~76%) even with
lz4 enabled. For read-dominated mixed workloads, DWB currently shows
overhead that needs further optimization (reducing non-checkpoint
DWB fsync costs).
Regards,
Baotiao