Re: More guidance on ctid

From: Bernice Southey <bernice(dot)southey(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: More guidance on ctid
Date: 2025-11-29 22:45:45
Message-ID: CAEDh4nzDMoRDZMDGGxKHzJ5hZscmvT2FWioTzQ6A0aJVtCO8NQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> I think the reporter above extrapolated this example to use ctid in
> other, non-appropriate cases. Should we add a warning to the docs to
> explain that the general use of ctid is discouraged?

I've been explaining this badly. Here's an example that I hope will
make clear why I'm so concerned.

(Read Committed default isolation level)
drop table if exists b1, b2;
create table b1 as select 1 id, 100 balance;
create table b2 as select 1 id, 100 balance;

With an immutable column self-join, it works as expected and
serializes the balance changes.

--session1
begin;
with x as (select id from b1 order by id for update)
update b1 set balance = balance + 1 from x where b1.id = x.id;

--session2
with x as (select id from b1 order by id for update)
update b1 set balance = balance - 100 from x where b1.id = x.id;
select * from b1;

--session1
commit;

--session2
UPDATE 1
id | balance
----+---------
1 | 1
(1 row)

But with ctid, the second update is lost.

--session1
begin;
with x as (select ctid from b2 order by id for update)
update b2 set balance = balance + 1 from x where b2.ctid = x.ctid;

--session2
with x as (select ctid from b2 order by id for update)
update b2 set balance = balance - 100 from x where b2.ctid = x.ctid;
select * from b2;

--session1
commit;

--session2
UPDATE 0
id | balance
----+---------
1 | 101
(1 row)

The session2 CTE blocks until it gets the updated version of the row
from session 1 commit. It gets b2.ctid = (0,2) giving x.ctid = (0,2).
But the UPDATE gets b2.ctid = (0,1) at the start of the transaction.
This doesn't change even after session 1 commits. The join fails. This
is correct, but not obvious.

This is why I think the docs should say something different to what
they currently do. The UPDATE example is fine only because it's run
repeatedly until there are no more rows to find.

I learned the primary key self-join pattern for order by/limit in
delete/update soon after I began with postgres. I'm pretty sure ctid
would have confused me. So I think the doc examples with an id column
will be very helpful, and safer.

If users discover ctid on internet forums or LLMs and then check the
docs, I think they should get some notice of the concurrency risk.

Here's a patch with an attempt to do the above.

Forum examples:
https://stackoverflow.com/questions/10245560/deadlocks-in-postgresql-when-running-update/71163671#71163671
https://www.reddit.com/r/PostgreSQL/comments/1cn0q1c/comment/l36ppve/

Thanks, Bernice

Attachment Content-Type Size
v1-add-ctid-guidance-and-remove-from-examples.patch text/x-patch 3.1 KB

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Daniel Gustafsson 2025-12-02 09:36:30 restart point vs restartpoint in the docs
Previous Message Andrew Jackson 2025-11-27 18:55:11 Add Restart=on-failure To Example Systemd File