Re: XLog size reductions: smaller XLRec block header for PG17

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: XLog size reductions: smaller XLRec block header for PG17
Date: 2024-02-02 13:52:50
Message-ID: 4a40ec9f-e254-41dc-8507-02e02a7c94f6@iki.fi
Lists: pgsql-hackers

On 22/01/2024 19:23, Robert Haas wrote:
> In the case of this particular patch, I think the problem is that
> there's no consensus on the design. There's not a ton of debate on
> this thread, but thread [1] linked in the original post contains a lot
> of vigorous debate about what the right thing to do is here and I
> don't believe we reached any meeting of the minds.

Yeah, so it seems.

> It looks like I never replied to
> https://www.postgresql.org/message-id/20221019192130.ebjbycpw6bzjry4v%40awork3.anarazel.de
> but, FWIW, I agree with Andres that applying the same technique to
> multiple fields that are stored together (DB OID, TS OID, rel #, block
> #) is unlikely in practice to produce many cases that regress. But the
> question for this thread is really more about whether we're OK with
> using ad-hoc bit swizzling to reduce the size of xlog records or
> whether we want to insist on the use of a uniform varint encoding.
> Heikki and Andres both seem to favor the latter. IIRC, I was initially
> more optimistic about ad-hoc bit swizzling being a potentially
> acceptable technique, but I'm not convinced enough about it to argue
> against two very smart committers both of whom know more about
> micro-optimizing performance than I do, and nobody else seems to be
> making this argument on this thread either, so I just don't really see
> how this patch is ever going to go anywhere in its current form.

I don't have a clear idea of how to proceed with this either. Some
thoughts I have:

Using varint encoding makes sense for length fields. The common values
are small, and if a length of anything is large, then the size of the
length field itself is insignificant compared to the actual data.
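
To make the length-field argument concrete, here is a minimal sketch of a
LEB128-style varint in C. It is a generic encoding chosen for illustration,
not the encoding from the patch under discussion, and the function names
varint_encode_u32 and varint_decode_u32 are hypothetical. Lengths below 128
take one byte; a full 32-bit length takes five.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode 'value' as a LEB128-style varint.
 * Writes 1..5 bytes into buf and returns the number of bytes written.
 */
static size_t
varint_encode_u32(uint32_t value, uint8_t *buf)
{
    size_t n = 0;

    assert(buf != NULL);
    while (value >= 0x80)
    {
        /* low 7 bits of the value, plus the continuation bit */
        buf[n++] = (uint8_t) (value | 0x80);
        value >>= 7;
    }
    buf[n++] = (uint8_t) value;     /* final byte, continuation bit clear */
    return n;
}

/*
 * Hypothetical helper: decode a varint from buf[0..len).
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than the 5 bytes a 32-bit value can need.
 */
static size_t
varint_decode_u32(const uint8_t *buf, size_t len, uint32_t *value)
{
    uint32_t result = 0;
    size_t   n = 0;
    int      shift = 0;

    while (n < len && shift < 35)
    {
        uint8_t b = buf[n++];

        result |= (uint32_t) (b & 0x7F) << shift;
        if ((b & 0x80) == 0)
        {
            *value = result;
            return n;
        }
        shift += 7;
    }
    return 0;
}
```

With this scheme a 100-byte payload pays one byte for its length, while a
payload of a few megabytes pays four, which is negligible next to the data.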

I don't like using varint encoding for OIDs. They might be small in
common cases, but it feels wrong to rely on that. They're just arbitrary
numbers. We could pick them randomly, it's just an implementation detail
that we use a counter to choose the next one. I really dislike the idea
that someone would do a pg_dump + restore, just to get smaller OIDs and
smaller WAL as a result.

It does make sense to have a fast-path (small-path?) for 0 OIDs though.
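
As a rough sketch of what such a fast path could look like: one bit in a
flags byte says "this OID is zero", and the four OID bytes are then left out
entirely. The flag OID_IS_ZERO and the helpers put_oid and get_oid are
hypothetical names for illustration, not existing WAL flags or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit; not an existing WAL record or block flag. */
#define OID_IS_ZERO 0x01

/*
 * Append an OID field to buf.  When oid is 0, set OID_IS_ZERO in *flags
 * and write nothing.  Returns the number of bytes written (0 or 4).
 */
static size_t
put_oid(uint32_t oid, uint8_t *flags, uint8_t *buf)
{
    assert(flags != NULL && buf != NULL);
    if (oid == 0)
    {
        *flags |= OID_IS_ZERO;
        return 0;
    }
    memcpy(buf, &oid, sizeof(oid));
    return sizeof(oid);
}

/*
 * Read an OID field back.  Returns the number of bytes consumed (0 or 4).
 */
static size_t
get_oid(uint8_t flags, const uint8_t *buf, uint32_t *oid)
{
    if (flags & OID_IS_ZERO)
    {
        *oid = 0;
        return 0;
    }
    memcpy(oid, buf, sizeof(*oid));
    return sizeof(*oid);
}
```

Unlike a varint, this does not reward small OIDs over large ones: every
nonzero OID costs the same four bytes.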

To shrink OID fields, you could refer to earlier WAL records. A special
value for "same relation as in previous record", or something like that.
Now we're just re-inventing LZ-style compression though. Might as well
use LZ4 or Snappy or something to compress the whole WAL stream. It's a
bit tricky to get the crash-safety right, but shouldn't be impossible.
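
For illustration, the "same relation as in previous record" idea could be
sketched as below. Everything here is an assumption made for the example:
the RelRef struct, the SAME_REL_AS_PREV flag, and the encode/decode helpers
are hypothetical and do not correspond to existing PostgreSQL structures.
Encoder and decoder each remember the last relation seen; a repeat costs
one flag byte instead of one flag byte plus twelve bytes of identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag: relation is the same as in the previous record. */
#define SAME_REL_AS_PREV 0x01

/* Hypothetical relation reference: tablespace, database, relation number. */
typedef struct RelRef
{
    uint32_t spc_oid;
    uint32_t db_oid;
    uint32_t rel_number;
} RelRef;

/* State kept on both the writing and the reading side. */
typedef struct RelRefState
{
    bool   valid;   /* has any relation been seen yet? */
    RelRef prev;    /* last relation written or read */
} RelRefState;

static bool
relref_equal(const RelRef *a, const RelRef *b)
{
    return a->spc_oid == b->spc_oid &&
           a->db_oid == b->db_oid &&
           a->rel_number == b->rel_number;
}

/* Writes a flag byte, then the full reference unless it repeats. */
static size_t
relref_encode(RelRefState *st, const RelRef *rel, uint8_t *buf)
{
    if (st->valid && relref_equal(&st->prev, rel))
    {
        buf[0] = SAME_REL_AS_PREV;
        return 1;
    }
    buf[0] = 0;
    memcpy(buf + 1, rel, sizeof(*rel));
    st->prev = *rel;
    st->valid = true;
    return 1 + sizeof(*rel);
}

/* Mirror of relref_encode; returns the number of bytes consumed. */
static size_t
relref_decode(RelRefState *st, const uint8_t *buf, RelRef *rel)
{
    if (buf[0] & SAME_REL_AS_PREV)
    {
        assert(st->valid);  /* a back-reference needs earlier context */
        *rel = st->prev;
        return 1;
    }
    memcpy(rel, buf + 1, sizeof(*rel));
    st->prev = *rel;
    st->valid = true;
    return 1 + sizeof(*rel);
}
```

The assert in relref_decode marks the weak spot: a reader that starts in the
middle of the stream has no previous record to refer to, so a scheme like
this would presumably have to reset its state wherever replay can begin.
Whole-stream compression faces the same dependency on earlier context.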

Has anyone seriously considered implementing wholesale compression of WAL?

--
Heikki Linnakangas
Neon (https://neon.tech)
