| From: | Evgeny Voropaev <evgeny(dot)voropaev(at)tantorlabs(dot)com> |
|---|---|
| To: | Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
| Subject: | Re: Compress prune/freeze records with Delta Frame of Reference algorithm |
| Date: | 2026-04-08 12:34:42 |
| Message-ID: | 3fc9f40b-2c4f-49c7-bf1c-570c5b9e6e9e@tantorlabs.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Tomas, Andreus, Andrey, hello!
> A ~170kB patch really should present some numbers
> quantifying the expected benefit. It doesn't need to be a real workload
> from production, but something plausible enough. Even some basic
> back-of-the-envelope calculations might be enough to show the promise.
The patch results in reduction of WAL total size by:
81% during vacuuming a table having no index,
and by 55% during vacuuming a table having an index.
The numbers are the next:
=== VACUUM (table with no index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 6743149 1184446 82%
Prune records size 6710185 1159723 5.8x
-------------------- ----------------- ----------------- -----------
=== VACUUM (table with index) ===
-------------------- ----------------- ----------------- -----------
DFOR off, bytes DFOR on, bytes Reduction
-------------------- ----------------- ----------------- -----------
WAL total size 20394208 8907090 56%
Prune records size 6812850 1225944 5.6x
-------------------- ----------------- ----------------- -----------
The logic of the tests is based on the technique from [1] and is the
next:
-- SQL
CREATE TABLE t_prune ( id int, val text )
WITH (fillfactor = 100, autovacuum_enabled = false);
INSERT INTO t_prune
SELECT g, 'x' FROM generate_series(1,3000000) g;
CREATE INDEX ON t_prune(id); -- for the test using an indexed table
DELETE FROM t_prune WHERE id % 500 <> 0;
SELECT pg_current_wal_flush_lsn(); -- get start_lsn here
VACUUM FREEZE t_prune; -- 3 times
SELECT pg_current_wal_flush_lsn(); -- get end_lsn here
# BASH
# stop cluster
pg_waldump -p $wal_dir -s $start_lsn -e $end_lsn 2>/dev/null;
The test is implemented in 052_prune_dfor_compression.pl, therefore
the presented results can be refetched by restarting this test script.
> Also, I find it somewhat unlikely we'd import a GPLv3 library like
> this, even if it's just a testing framework. Even ignoring the
> question of having a different license for some of the code, it'd mean
> maintenance burden (maybe libtap is stable/mature, no idea). I don't
> see why this would be better than "write a SQL callable test module".
I am ready to rework it once there is consensus on the core of the
patch.
Best regards,
Evgeny.
P.s. rebased onto a1643d40b30.
| Attachment | Content-Type | Size |
|---|---|---|
| v08-0001-Implement-vect-and-uniqsortvect-containers-and-b.patch | text/x-patch | 95.6 KB |
| v08-0002-Implement-Delta-Frame-of-Reference-compression.patch | text/x-patch | 42.1 KB |
| v08-0003-Use-Delta-Frame-of-Reference-DFoR-to-compress-pr.patch | text/x-patch | 35.6 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrew Dunstan | 2026-04-08 12:50:11 | Re: Buildfarm misses running some contrib TAP tests |
| Previous Message | Amit Kapila | 2026-04-08 11:54:52 | Re: PG 19 release notes and authors |