Re: Add 64-bit XIDs into PostgreSQL 15

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Chris Travers <chris(dot)travers(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Fedor Sigaev <teodor(at)sigaev(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Aleksander Alekseev <afiskon(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Maxim Orlov <orlovmg(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Subject: Re: Add 64-bit XIDs into PostgreSQL 15
Date: 2022-11-24 19:25:45
Message-ID: CAH2-Wzm-YAp2bfJ9pH8jmu+55jmrdT38j8FLYE7eh9T5Q0ykzw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 20, 2022 at 11:58 PM Chris Travers <chris(dot)travers(at)gmail(dot)com> wrote:
> I can start by saying I think it would be helpful (if the other issues are approached reasonably) to have 64-bit xids, but there is an important piece of context in preventing xid wraparounds that seems missing from this patch unless I missed something.
>
> XID wraparound is a symptom, not an underlying problem. It usually occurs when autovacuum or other vacuum strategies have unexpected stalls and therefore fail to work as expected. Shifting to 64-bit XIDs dramatically changes the sorts of problems that these stalls are likely to pose to operational teams: you can find you are running out of storage rather than facing an imminent database shutdown. Worse, this patch delays the problem until some (possibly far later!) time, when vacuum will take far longer to finish, and options for resolving the problem are diminished. As a result I am concerned that merely changing xids from 32-bit to 64-bit will lead to a smaller number of far more serious outages.

This is exactly what I think (except perhaps for the part about having
fewer outages overall). The more transaction ID space you need, the
more space you're likely to need in the near future.

We can all agree that having more runway is better than having less
runway, at least in some abstract sense, but that in itself doesn't
help the patch series very much. The first time the system-level
oldestXid (or database level datminfrozenxid) attains an age of 2
billion XIDs will usually *also* be the first time it attains an age
of (say) 300 million XIDs. Even 300 million is usually a huge amount
of XID space relative to (say) the number of XIDs used every 24 hours.
So I know exactly what you mean about just addressing a symptom.
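
To put rough numbers on that, here is a minimal sketch of the
arithmetic. The burn rate of 5 million XIDs per day is a made-up
figure for illustration, and the helper names are hypothetical; only
the 2 billion and 300 million ages come from the text above.

```python
# Illustrative arithmetic only. XIDS_PER_DAY is an assumed, made-up burn rate.
AGE_LIMIT = 2_000_000_000      # the "2 billion XIDs" age mentioned above
XIDS_PER_DAY = 5_000_000       # hypothetical workload


def days_already_behind(current_age: int, xids_per_day: int) -> float:
    """How many days of XID consumption the current age represents."""
    return current_age / xids_per_day


def days_of_runway(current_age: int, xids_per_day: int,
                   limit: int = AGE_LIMIT) -> float:
    """Days until the age reaches `limit`, assuming a constant burn rate."""
    return (limit - current_age) / xids_per_day


if __name__ == "__main__":
    age = 300_000_000
    print(days_already_behind(age, XIDS_PER_DAY))  # days of work already skipped
    print(days_of_runway(age, XIDS_PER_DAY))       # days left before the limit
```

Under these assumed numbers, an age of 300 million already means
roughly two months without catching up, long before the limit is in
sight.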

The whole project seems to just ignore basic, pertinent questions.
Questions like: why are we falling behind like this in the first
place? And: If we don't catch up soon, why should we be able to catch
up later on? Falling behind on freezing is still a huge problem with
64-bit XIDs.

Part of the problem with the current design is that table age has
approximately zero relationship with the true cost of catching up on
freezing -- we are "using the wrong units", in a very real sense. In
general we may have to do zero freezing to advance a table's
relfrozenxid age by a billion XIDs, or we might have to write
terabytes of FREEZE_PAGE records to advance a similar looking table's
relfrozenxid by just one single XID (it could also swing wildly over
time for the same table). The system simply doesn't try to reason
about any of this right now.
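
A toy model may make the "wrong units" point concrete. This is only a
sketch under simplifying assumptions: each heap page is reduced to its
oldest unfrozen XID (None if the page is fully frozen), XIDs are plain
increasing integers with no wraparound, and pages_to_freeze is a
hypothetical helper, not anything in PostgreSQL.

```python
from typing import Optional, Sequence


def pages_to_freeze(oldest_unfrozen_xid_per_page: Sequence[Optional[int]],
                    new_relfrozenxid: int) -> int:
    """Count pages that hold an unfrozen XID older than the new relfrozenxid.

    Every such page has to be frozen (and WAL-logged) before relfrozenxid
    can be advanced to `new_relfrozenxid`.
    """
    return sum(
        1
        for xid in oldest_unfrozen_xid_per_page
        if xid is not None and xid < new_relfrozenxid
    )


# Table A: 1000 fully frozen pages plus one page with a very recent XID.
table_a = [None] * 1000 + [1_000_000_500]

# Table B: every page still holds a tuple from XID 500.
table_b = [500] * 1001

if __name__ == "__main__":
    # Advancing table A by a billion XIDs touches no pages at all.
    print(pages_to_freeze(table_a, 1_000_000_500))
    # Advancing table B by a single XID means freezing every page.
    print(pages_to_freeze(table_b, 501))
```

Both tables can report the same age, yet the work needed to advance
relfrozenxid differs by the full size of the table.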

There are no settings for freezing that use physical units, or that
frame the problem as one of being behind by this many unfrozen pages
(they are all based on XID age). And so the problem with letting table
age get into the billions isn't even that we'll never catch up -- we
actually might catch up very easily! The real problem is that we have
no way of knowing ahead of time (at least not right now). VACUUM
should be managing the debt, *and* the uncertainty about how much debt
we're really in. VACUUM needs to have a dynamic, probabilistic
understanding of what's really going on -- something much more
sophisticated than looking at table age in autovacuum.c.
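
As one possible reading of "debt plus uncertainty" in physical units,
the sketch below estimates the number of unfrozen pages from a random
sample of pages and reports a standard error alongside it. This is
not how VACUUM works today and is not a proposal from this thread;
estimate_unfrozen_pages and the is_unfrozen callback are hypothetical
names used only for illustration.

```python
import math
import random
from typing import Callable, Tuple


def estimate_unfrozen_pages(is_unfrozen: Callable[[int], bool],
                            total_pages: int,
                            sample_size: int,
                            rng: random.Random) -> Tuple[float, float]:
    """Estimate freeze debt in pages from a simple random sample.

    Returns (estimated unfrozen pages, standard error in pages), using the
    usual binomial approximation for a sampled proportion.
    """
    sample = rng.sample(range(total_pages), sample_size)
    hits = sum(1 for page_no in sample if is_unfrozen(page_no))
    frac = hits / sample_size
    stderr = math.sqrt(frac * (1.0 - frac) / sample_size)
    return frac * total_pages, stderr * total_pages


if __name__ == "__main__":
    # Synthetic table: the first quarter of 100,000 pages is unfrozen.
    rng = random.Random(0)
    estimate, error = estimate_unfrozen_pages(
        lambda page_no: page_no < 25_000, 100_000, 1_000, rng)
    print(f"about {estimate:.0f} unfrozen pages, give or take {error:.0f}")
```

The point of the sketch is the shape of the answer: a debt measured in
pages, with an explicit error bar, rather than a single XID age.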

One reason why you might want to advance relfrozenxid proactively is
to give the system a better general sense of the true relationship
between logical XID space and physical freezing for a given table and
workload -- it gives a clearer picture about the conditions in the
table. The relationship between SLRU space and physical heap pages and
the work of freezing is made somewhat clearer by a more proactive
approach to advancing relfrozenxid. That's one way that VACUUM can
lower the uncertainty I referred to.

--
Peter Geoghegan
