Re: Refactor code around GUC default_toast_compression

From: Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Refactor code around GUC default_toast_compression
Date: 2026-05-01 09:14:07
Message-ID: CAJTYsWX2erCHdEV1NoPdGP5fOvPjHYf=MQ2C-N+h6yE45oKihA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, 1 May 2026 at 13:21, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> Hi all,
>
> While hacking on the TOAST code, I have been annoyed more than once
> with the following piece in toast_compression.h:
> /*
> * Built-in compression method ID. The toast compression header will store
> * this in the first 2 bits of the raw length. These built-in compression
> * method IDs are directly mapped to the built-in compression methods.
> *
> * Don't use these values for anything other than understanding the meaning
> * of the raw bits from a varlena; in particular, if the goal is to
> identify
> * a compression method, use the constants TOAST_PGLZ_COMPRESSION, etc.
> * below. We might someday support more than 4 compression methods, but
> * we can never have more than 4 values in this enum, because there are
> * only 2 bits available in the places where this is stored.
> */
> typedef enum ToastCompressionId
> {
> TOAST_PGLZ_COMPRESSION_ID = 0,
> TOAST_LZ4_COMPRESSION_ID = 1,
> TOAST_INVALID_COMPRESSION_ID = 2,
> } ToastCompressionId;
>
> This is due the fact that we have only two bits that can be used in
> va_tcinfo or va_extinfo. While looking at the addition of a new
> compression method, this was causing a mess, so I have hacked the
> attached patch, that makes the addition of more compression methods
> easier. The idea is centralized in toast_compression.c, with the
> addition of a registry that knows about all the TOAST compression
> methods and its meta-data:
> - name
> - GUC enum values.
> - attcompression char value.
> - varatt on-disk value.
>
>
I looked at the patch, and the refactoring direction looks reasonable
to me. I noticed a few small things worth cleaning up (although
it is for v20, just wanted to drop it here for future)

1. The patch includes an unrelated hunk in
doc/src/sgml/ref/alter_index.sgml, adding text about `ALTER INDEX ...
ATTACH PARTITION`. That looks like an accidental carry-over from another
patch and shouldn't be there ig.

2. The comment in src/include/access/toast_compression.h describing
default_toast_compression looks stale after this change? It still says
that the GUC value is one of the char values stored in
pg_attribute.attcompression, but the patch changes it to use the new
ToastCompressionGucValue enum values instead.(Maybe I'm
missing something)

3. One minor point: CompressionIdToMethod() seems to be added as a public
helper, but I could not find any callers in this patch. Also,
pg_column_compression() still keeps its own cmid-to-name switch. If the
intent is to centralize these mappings in the registry, perhaps that code
could use the new helper path as well, otherwise the unused helper may
not
be necessary yet (though it might be in future).

Thanks for the patch!

Regards,
Ayush

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Salma El-Sayed 2026-05-01 09:52:54 [GSoC 2026] - B-tree Index Bloat Reduction - Introduction
Previous Message Richard Guo 2026-05-01 08:25:13 Re: Remove inner joins based on foreign keys