Re: [HACKERS] Custom compression methods

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-03-20 00:25:43
Message-ID: 4193854.1616199943@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> On 2021-Mar-19, Robert Haas wrote:
>> Well, I really do hope that some day in the bright future, pglz will
>> no longer be the thing we're shipping as the postgresql.conf default.
>> So we'd just be postponing the noise until then. I think we need a
>> better idea than that.

> Hmm, why? In that future, we can just change the pg_dump behavior to no
> longer dump the compression clause if it's lz4 or whatever better
> algorithm we choose. So I think I'm clarifying my proposal to be "dump
> the compression clause if it's different from the compiled-in default"
> rather than "different from the GUC default".

Extrapolating from the way we've dealt with similar issues
in the past, I think the structure of pg_dump's output ought to be:

1. SET default_toast_compression = 'source system's value'
in among the existing passel of SETs at the top. Doesn't
matter whether or not that is the compiled-in value.

2. No mention of compression in any CREATE TABLE command.

3. For any column having a compression option different from
the default, emit ALTER TABLE SET ... to set that option after
the CREATE TABLE. (You did implement such a SET, I trust.)
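
As an illustration only, the three-part structure above could be sketched
like this. It is a toy generator, not pg_dump code. The function name, the
input shape, and the fixed "text" column type are hypothetical, and the
"ALTER TABLE ... ALTER COLUMN ... SET COMPRESSION" spelling is an assumption
about the final syntax, since the message itself only writes
"ALTER TABLE SET ...".

```python
from typing import Dict, List


def dump_schema(source_default: str,
                tables: Dict[str, Dict[str, str]]) -> List[str]:
    """Return dump statements in the three-part order proposed above.

    tables maps table name -> {column name: compression method}.
    Every column is emitted as type text to keep the sketch short.
    """
    # 1. One SET near the top, carrying the source system's value.
    out = [f"SET default_toast_compression = '{source_default}';"]

    for table, columns in tables.items():
        # 2. CREATE TABLE never mentions compression.
        cols = ", ".join(f"{col} text" for col in columns)
        out.append(f"CREATE TABLE {table} ({cols});")

        # 3. Only columns that differ from the default get an ALTER
        #    (assumed syntax, see the note above).
        for col, method in columns.items():
            if method != source_default:
                out.append(
                    f"ALTER TABLE {table} ALTER COLUMN {col} "
                    f"SET COMPRESSION {method};"
                )
    return out


if __name__ == "__main__":
    for stmt in dump_schema("pglz", {"docs": {"title": "pglz", "body": "lz4"}}):
        print(stmt)
```

With all columns at the default, the only compression-related line in the
output is the single SET.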

This minimizes the chatter for the normal case where all or most
columns have the same setting, and more importantly it allows the
dump to be read by older PG systems (or non-PG systems, or newer
systems built without --with-lz4) that would fail altogether
if the CREATE TABLE commands contained compression options.
To use the dump that way, you do have to be willing to ignore
errors from the SET and the ALTERs ... but that beats the heck
out of having to manually edit the dump script to get rid of
embedded COMPRESSION clauses.
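
A rough model of that argument, under stated assumptions: the target server
is simulated as rejecting any statement that mentions compression, and the
restore keeps going after an error (which is how psql behaves unless
ON_ERROR_STOP is set). The replay() helper and the sample statements are
hypothetical, and the SQL spellings are assumed rather than taken from the
patch.

```python
from typing import List, Tuple


def replay(statements: List[str],
           knows_compression: bool) -> Tuple[List[str], List[str]]:
    """Apply statements one by one, continuing past errors.

    A server that does not know about column compression is modeled
    as rejecting every statement that mentions it.
    """
    applied: List[str] = []
    failed: List[str] = []
    for stmt in statements:
        mentions_compression = "compression" in stmt.lower()
        if mentions_compression and not knows_compression:
            failed.append(stmt)   # error is reported, restore continues
        else:
            applied.append(stmt)
    return applied, failed


# Dump in the proposed style: compression kept out of CREATE TABLE.
separate = [
    "SET default_toast_compression = 'lz4';",
    "CREATE TABLE docs (body text);",
    "ALTER TABLE docs ALTER COLUMN body SET COMPRESSION pglz;",
]

# Dump with the clause embedded in CREATE TABLE.
embedded = ["CREATE TABLE docs (body text COMPRESSION pglz);"]

if __name__ == "__main__":
    print(replay(separate, knows_compression=False))
    print(replay(embedded, knows_compression=False))
```

In this model the first dump loses only the SET and the ALTER while the
table is still created; the second dump loses the table itself.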

I'm not sure whether we'd still need to mess around beyond
that to make the buildfarm's existing upgrade tests happy.
But we *must* do this much in any case, because as it stands
this patch has totally destroyed some major use-cases for
pg_dump.

There might be scope for a dump option to suppress mention
of compression altogether (comparable to, eg, --no-tablespaces).
But I think that's optional. In any case, we don't want
to put people in a position where they should have used such
an option and now they have no good way to recover their
dump to the system they want to recover to.

regards, tom lane
