cmax docs seem misleading

From: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: cmax docs seem misleading
Date: 2026-01-20 18:46:01
Message-ID: CA+renyWVogpNSTug5e+PTMWmTOcj8UXsAhHuHiavsVU0rzNpUQ@mail.gmail.com
Lists: pgsql-docs

The docs for cmax say:[0]

> The command identifier within the deleting transaction, or zero.

This was true once upon a time, I think. But nowadays cmax and cmin
are the same physical field, and the user-facing system columns don't
seem to be trying to interpret it. For example:

[v19devel:5432][334102] regression=# create table pj (a int);
CREATE TABLE
[v19devel:5432][334102] regression=# begin; insert into pj values (1); insert into pj values (2); commit;
BEGIN
INSERT 0 1
INSERT 0 1
COMMIT
[v19devel:5432][334102] regression=# select ctid, xmin, xmax, cmin, cmax, * from pj;
 ctid  | xmin  | xmax | cmin | cmax | a
-------+-------+------+------+------+---
 (0,1) | 22424 |    0 |    0 |    0 | 1
 (0,2) | 22424 |    0 |    1 |    1 | 2

So here you have a non-zero cmax for a not-deleted row.

The converse isn't true either. "Or zero" hints that a deleted row
always has a non-zero value, but 0 is also simply the id of the first
command in a transaction, so a row deleted by that command shows 0 too.
(Null would be a meaningful signal, but I assume we don't want to do
that.)

As far as I can tell, it is impossible to observe cmin <> cmax. From
heap_getsysattr (access/common/heaptuple.c):

        case MinCommandIdAttributeNumber:
        case MaxCommandIdAttributeNumber:

            /*
             * cmin and cmax are now both aliases for the same field, which
             * can in fact also be a combo command id. XXX perhaps we should
             * return the "real" cmin or cmax if possible, that is if we are
             * inside the originating transaction?
             */
            result = CommandIdGetDatum(HeapTupleHeaderGetRawCommandId(tup->t_data));
            break;

So it looks like these system columns also don't look up combocids.

I'm not interested in changing any of this, but I think we could clean
up the docs a little. The description for cmin is questionable too:

> The command identifier (starting at zero) within the inserting transaction.

That's true if the row hasn't been deleted yet, but then we overwrite the field.
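
To make the behavior described so far concrete, here is a toy model in C. It is only a sketch under my own assumptions, not PostgreSQL code: ToyTuple, toy_insert, toy_delete, and toy_getsysattr are hypothetical names, and combo ids are simply handed out as 0, 1, 2, ... The point is that there is one stored field, a delete overwrites it, and both system columns return it raw.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not PostgreSQL's structs or API. */
typedef uint32_t CommandId;
typedef uint32_t TransactionId;

typedef struct ToyTuple
{
    TransactionId xmin;     /* inserting transaction */
    TransactionId xmax;     /* deleting transaction, 0 if none */
    CommandId     cid;      /* ONE field: insert cid, delete cid, or combo id */
    bool          is_combo; /* true if cid holds a combo id */
} ToyTuple;

typedef enum { TOY_CMIN, TOY_CMAX } ToyAttr;

/* Next combo id to hand out (assumption: sequential, starting at 0). */
CommandId toy_next_combo = 0;

/* Insert: the shared field holds the inserting command's id. */
ToyTuple toy_insert(TransactionId xid, CommandId cid)
{
    ToyTuple t = { xid, 0, cid, false };
    return t;
}

/* Delete: the shared field is overwritten. */
void toy_delete(ToyTuple *t, TransactionId xid, CommandId cid)
{
    t->xmax = xid;
    if (t->xmin == xid)
    {
        /* Inserted and deleted by the same transaction: both ids are
         * still needed, so the field stores a combo id instead. */
        t->cid = toy_next_combo++;
        t->is_combo = true;
    }
    else
    {
        /* Deleted by a different transaction: the insert cid is lost. */
        t->cid = cid;
    }
}

/* Both system columns return the raw field, with no combo lookup. */
CommandId toy_getsysattr(const ToyTuple *t, ToyAttr attr)
{
    (void) attr;            /* cmin and cmax are aliases in this model */
    return t->cid;
}
```

In this model, a row inserted by command 1 reads cmin = cmax = 1 while it is still live, and after another transaction deletes it as its command 0, both columns read 0. At no point can the two columns differ.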

Attached is a patch that I think makes the descriptions of both fields
a little clearer. It could be improved further by some glossary entries
explaining what a command id is (and a combocid). Or maybe that's too
much information? And maybe we should be more drastic: combine cmin &
cmax into one entry, and explain that they are two names for the same
value, which might signify the insert cid, the delete cid, or a
combocid.

[0] https://www.postgresql.org/docs/current/ddl-system-columns.html#DDL-SYSTEM-COLUMNS-CMAX

Yours,

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

Attachment                                                Content-Type  Size
v1-0001-docs-Clarify-cmin-and-cmax-system-columns.patch  text/x-patch  1.7 KB