Re: XLog size reductions: smaller XLRec block header for PG17

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: XLog size reductions: smaller XLRec block header for PG17
Date: 2024-02-02 15:42:54
Message-ID: CA+Tgmob2wDLyE6SeQ0XCED17RJfPyo4WLFL4GW13-ZK6Co67VA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 2, 2024 at 8:52 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> To shrink OIDs fields, you could refer to earlier WAL records. A special
> value for "same relation as in previous record", or something like that.
> Now we're just re-inventing LZ-style compression though. Might as well
> use LZ4 or Snappy or something to compress the whole WAL stream. It's a
> bit tricky to get the crash-safety right, but shouldn't be impossible.
>
> Has anyone seriously considered implementing wholesale compression of WAL?

I thought about the idea of referring to earlier WAL and/or undo
records when working on zheap. It seems tricky, because what if replay
starts after those WAL records and you can't refer back to them? It's
OK if you can make sure that you never depend on anything prior to the
latest checkpoint, but the natural way to make that work is to add
more bookkeeping like what we already do for FPIs, and then you have to
worry about whether that extra logic is going to be more expensive
than what you save. FPIs are so expensive that we can afford to go to
a lot of trouble to avoid them and still come out ahead, but storing
the full OID instead of an OID reference isn't nearly in the same
category.
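
To make the comparison concrete, here's roughly the kind of
checkpoint-relative bookkeeping I mean, by analogy with the FPI
decision. None of these names are real PostgreSQL APIs; it's only a
sketch of when a short "same relation as before" reference would be
safe to emit instead of the full OIDs:

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

typedef struct
{
    uint32_t    spcOid;         /* tablespace */
    uint32_t    dbOid;          /* database */
    uint32_t    relNumber;      /* relation */
} RelRef;

typedef struct
{
    RelRef      last_full_ref;  /* last relation logged with full OIDs */
    XLogRecPtr  logged_at;      /* LSN of the record that logged it */
} RelRefState;

static bool
can_use_short_relation_ref(const RelRefState *state,
                           const RelRef *ref,
                           XLogRecPtr redo_ptr)
{
    /*
     * Like the FPI logic: anything logged before the latest checkpoint's
     * redo pointer may never be seen by replay, so fall back to logging
     * the full OIDs in that case.
     */
    if (state->logged_at < redo_ptr)
        return false;

    return state->last_full_ref.spcOid == ref->spcOid &&
           state->last_full_ref.dbOid == ref->dbOid &&
           state->last_full_ref.relNumber == ref->relNumber;
}

And that extra test, plus keeping the state up to date, is exactly the
overhead that has to be cheaper than the handful of bytes it saves.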

I also thought about trying to refer to earlier items on the same
page, thinking that the locality would make things easier. But it
doesn't, because we don't know which page will ultimately contain the
WAL record until quite late, so we can't reason about what's on the
same page when constructing it.

Wholesale compression of WAL might run into some of the same issues,
e.g. if you don't want to compress each record individually, that
means you can't compress until you know the insert position. And even
then, if you want the previous data to be present in the compressor as
context, you almost need all the WAL compression to be done by a
single process. But if the LSNs are relative to the compressed stream,
you have to wait for that compression to finish before you can
determine the LSN of the next record, which seems super-painful, and
if they're relative to the uncompressed stream, then mapping them onto
fixed-size files gets tricky.
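
Just to illustrate that ordering problem with a toy example - this is
not PostgreSQL code, and it only assumes liblz4 - the physical position
of the next batch in the compressed stream isn't known until the
compressor returns, so any LSN defined relative to that stream can't be
handed out any earlier:

#include <stdio.h>
#include <lz4.h>

int
main(void)
{
    const char  records[] = "pretend this holds a batch of WAL records";
    char        compressed[LZ4_COMPRESSBOUND(sizeof(records))];
    long        write_pos = 0;  /* stand-in for an LSN into the compressed stream */

    int len = LZ4_compress_default(records, compressed,
                                   (int) sizeof(records),
                                   (int) sizeof(compressed));
    if (len <= 0)
        return 1;

    /*
     * Only now do we know where the next batch starts; until the
     * compressor returned, the "LSN" of the next record was unknowable.
     */
    write_pos += len;
    printf("next batch starts at compressed offset %ld\n", write_pos);
    return 0;
}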

My hunch is that we can squeeze more out of the existing architecture
with a lot less work than it would take to do major rearchitecture
like compressing everything. I don't know how we can agree on a way of
doing that because everybody's got slightly different ideas about the
right way to do this. But if agreeing on how to evolve the system
we've got seems harder than rewriting it, we need to stop worrying
about WAL overhead and learn how to work together better.

--
Robert Haas
EDB: http://www.enterprisedb.com
