Re: Adding CommandID to heap xlog records

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>
Subject:	Re: Adding CommandID to heap xlog records
Date:	2023-02-28 13:52:26
Message-ID:	1ba2899e-77f8-7866-79e5-f3b7d1251a3e@iki.fi
Lists:	pgsql-hackers

I took another stab at this from a different angle, and tried to use
this to simplify logical decoding. The theory was that if we included
the command ID in the WAL records, we wouldn't need the separate
HEAP2_NEW_CID record anymore, and could remove much of the code in
reorderbuffer.c that's concerned with tracking the ctid->(cmin,cmax)
mapping. Unfortunately, it didn't work out.

Here's one problem:

Insert with cmin 1
Commit
Delete the same tuple with cmax 2.
Abort

Even if we store the cmin in the INSERT record, and set it on the tuple
on replay, the DELETE overwrites it. That's OK for the original
transactions, because they only look at the cmin/cmax of their own
transaction, but it's a problem for logical decoding. If we see the
inserted tuple during logical decoding, we need the cmin of the tuple.
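
As a toy illustration of the sequence above (not PostgreSQL code; the
class and function names are made up for this sketch), assume a tuple
header with a single command ID slot that cmin and cmax share:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTuple:
    """Toy heap tuple header: one CID slot shared by cmin and cmax."""
    xmin: int                   # inserting transaction
    cid: int                    # holds cmin until a delete stores cmax here
    xmax: Optional[int] = None  # deleting transaction, if any


def toy_insert(xid: int, cmin: int) -> ToyTuple:
    return ToyTuple(xmin=xid, cid=cmin)


def toy_delete(tup: ToyTuple, xid: int, cmax: int) -> None:
    # A deleter from a different transaction overwrites the shared slot.
    tup.xmax = xid
    tup.cid = cmax


tup = toy_insert(xid=100, cmin=1)   # Insert with cmin 1, then commit
toy_delete(tup, xid=101, cmax=2)    # Delete with cmax 2, then abort

# The delete aborted, so the tuple is still live, but its cmin is gone:
print(tup.cid)
```

In this model the slot ends up holding the aborted deleter's cmax, so
anything that later needs the inserter's cmin has to get it from
somewhere other than the tuple.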

We could still just replace the HEAP2_NEW_CID records with the CIDs in
the heap INSERT/UPDATE/DELETE records, and use that information to
maintain the ctid->(cmin,cmax) mapping in reorderbuffer.c like we do
today. But that doesn't really simplify reorderbuffer.c much. Attached
is a patch for that, for the archives' sake.

Another problem with that is that logical decoding needs slightly
different information than what we store on the tuples on disk. My
original motivation for this was for Neon, which needs the WAL replay to
restore the same CID as what's stored on disk, whether it's cmin, cmax
or combocid. But for logical decoding, we need the cmin or cmax, *not*
the combocid. To cater for both uses, we'd need to include both the
original cmin/cmax and the possible combocid, which again makes it more
complicated.
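
A rough sketch of why the two consumers want different values, assuming
a simplified combo CID table that is local to one backend (the
ToyComboCids class is hypothetical and only loosely modelled on the
idea; it is not the real implementation):

```python
class ToyComboCids:
    """Toy backend-local table mapping a combo CID to a (cmin, cmax) pair."""

    def __init__(self):
        self._pairs = []    # combo CID -> (cmin, cmax)
        self._index = {}    # (cmin, cmax) -> combo CID

    def get_combo(self, cmin, cmax):
        key = (cmin, cmax)
        if key not in self._index:
            self._index[key] = len(self._pairs)
            self._pairs.append(key)
        return self._index[key]

    def resolve(self, combo):
        return self._pairs[combo]


combos = ToyComboCids()

# One transaction inserts a tuple with cmin 1 and later deletes it with
# cmax 3. The single CID slot cannot hold both, so a combo CID goes there.
on_disk_cid = combos.get_combo(cmin=1, cmax=3)

# A replay that must reproduce the on-disk bytes wants on_disk_cid.
# Logical decoding wants the real pair, which only the table can supply.
print(on_disk_cid, combos.resolve(on_disk_cid))
```

The stored value is meaningless without the table, which is why a WAL
record serving both uses would have to carry the combo CID as well as
the original cmin/cmax.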

So unfortunately I don't see much opportunity to simplify logical
decoding with this. However, please take a look at the first two patches
attached. They're tiny cleanups that make sense on their own.

- Heikki

Attachment	Content-Type	Size
0001-Improve-comment-on-why-we-need-ctid-cmin-cmax-mappin.patch	text/x-patch	1.9 KB
0002-Remove-redundant-check-for-fast_forward.patch	text/x-patch	926 bytes
0003-Remove-combocid-field-in-logical-decode-that-was-jus.patch	text/x-patch	6.0 KB
0004-Include-command-ID-in-heapam-records-on-catalog-tabl.patch	text/x-patch	34.3 KB

In response to

Re: Adding CommandID to heap xlog records at 2023-01-31 17:48:49 from vignesh C

Responses

Improve comment on cid mapping (was Re: Adding CommandID to heap xlog records) at 2023-06-26 06:57:56 from Heikki Linnakangas
