Re: lifetime of the old CTID

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Matthias Apitz <guru(at)unixarea(dot)de>, Christophe Pettus <xof(at)thebuild(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: lifetime of the old CTID
Date: 2022-07-06 06:41:58
Message-ID: 193fd8fb631c7e8078de9d25bffcb1a04f6e1943.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 2022-07-06 at 08:18 +0200, Matthias Apitz wrote:
> El día martes, julio 05, 2022 a las 10:44:23p. m. -0700, Christophe Pettus escribió:
>
> >
> >
> > > On Jul 5, 2022, at 22:35, Matthias Apitz <guru(at)unixarea(dot)de> wrote:
> > > Internally, in the DB layer, the read_where() builds the row list matching
> > > the WHERE clause as a SCROLLED CURSOR of
> > >
> > >    SELECT ctid, * FROM d01buch WHERE ...
> > >
> > > and each fetch() delivers the next row from this cursor. The functions
> > > start_transaction() and end_transaction() do what their names suggest and
> > > rewrite_actual_row() does a new SELECT based on the ctid of the actual row
> > >
> > >    SELECT * FROM d01buch WHERE ctid = ... FOR UPDATE
> > >    ...
> > >    UPDATE ...
> >
> > On first glance, it appears that you are using the ctid as a primary key for a row, and that's highly not-recommended.  The ctid is never intended to be stable in the database, as you have
> > discovered.  There are really no particular guarantees about ctid values being retained.
> >
> > I'd suggest having a proper primary key column on the table, and using that instead.
>
> Ofc, each table has its own primary key(s), used for example for the
> SELECT ctid, * FROM d01buch WHERE ...
>
> As I said, we came to PostgreSQL from Sybase (and Oracle) and Sybase has
> for each table a so called SYB_IDENTITY_COLUMN which is static for the
> table and its value does not change. When we would add now to some 400 tables an
> additional INTEGER column (and triggers to fill this on INSERT) this
> would be a big change in our DB layer and migration of databases in the
> field. Your suggesting (thanks for it in any case) is not that easy to
> implement, and no option at the moment.
>
> At the moment the DB layer informs the application layer correctly when
> the row can not be found by the ctid for an UPDATE and the application
> must handle this situation correctly (logging, 2nd pass through with a
> new CURSOR etc.).
>
> If I understand Laurenz correctly, there seems to be a way to keep the
> tuple old-ctid : new-ctid for some time, at least until the ney
> autovacuum.

No, there isn't. I said that you might be able to mutilate the PostgreSQL
code so that it does that, but I think that would be a bad idea.

Using the primary key is the proper solution. To be safe from concurrent
modifications, use a logic like in this pseudo-code:

FOR b IN SELECT pk, other_columns FROM books WHERE condition
UPDATE books SET ... WHERE pk = ... AND condition

Checking the condition again on the inner UPDATE will detect concurrent
data modifications. If the UPDATE changes nothing, then a book has been
removed or updated by a concurrent transaction, and you ignore it.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Laurenz Albe 2022-07-06 08:24:14 Re: How to upgrade postgres version 8 to 13
Previous Message Matthias Apitz 2022-07-06 06:18:42 Re: lifetime of the old CTID