Re: 64 bit transaction id

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Павел Ерёмин <shnoor111gmail(at)yandex(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 64 bit transaction id
Date: 2019-12-06 13:30:28
Message-ID: CA+Tgmoa78Ug5Z_dNhEt=VX9mRn6obFbzHN1aKpFSuWrY2FVA3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 7, 2019 at 10:28 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> The above is a very good summary of the constraints that have led to our
> current handling of XID wraparound. If we are concerned about excessive
> vacuum freeze overhead, why is the default autovacuum_freeze_max_age =
> 200000000 so low? That causes feezing when the pg_xact directory holds
> 200 million xids or 50 megabytes of xid status?
>
> As far as I understand it, we cause the database to stop writes when the
> xid counter gets within 2 billion xids of the current transaction
> counter, so 200 million is only 1/10th to that limit, and even then, I
> am not sure why we couldn't make it stop writes at 3 billion or
> something. My point is that increasing the default
> autovacuum_freeze_max_age value seems like an easy way to reduce vacuum
> freeze. (While, the visibility map helps avoid vacuum freeze from
> reading all heap pages, and we still need to read all index pages.)

Yeah, I've also wondered why this isn't higher by default, but it's a
somewhat tricky topic.

Three billion won't work, because it's deeply baked into PostgreSQL's
architecture that at most two billion XIDs are used at one time. For
comparison purposes, the four billion XIDs form a ring, so that from
the perspective of any individual XID, half of the XIDs are in the
future and the other half are in the past. If three billion XIDs were
in use simultaneously, say starting with XID 5 and ending with XID
3,000,000,004, then XID 5 would see XID 3,000,000,004 as being the
past rather than the future, while XID 1,500,000,000 would (correctly)
see XID 5 as in the past and XID 3,000,000,004 as in the future. So
XID comparison would not be transitive, which would break a lot of
code. Allowing at most two billion XIDs to be in use at one time fixes
this problem.

That doesn't mean we couldn't raise the setting. It just means that
the hard limit is two billion, not four billion. But, how high should
we raise it? The highest safe value depends on how many XIDs you'll
burn while the freezing vacuums are running, which depends on both the
size of the database and the rate of XID consumption, and those values
can be very different on different systems. I think most people could
get by with a significantly higher value, but even with the current
value I think there are probably some people who run out of XIDs, at
which point they can no longer write to the database. The higher we
make the default, the more people are going to have that problem. It's
true that a lot of people only hit the limit because something has
gone wrong, like they've forgotten about a prepared transaction or an
unused replication slot, but still, on high-velocity systems you can't
afford to cut it too close because you're still going to be burning
through XIDs while vacuum is running.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-12-06 13:31:43 Re: [PATCH] print help from psql when user tries to run pg_restore, pg_dump etc
Previous Message Jakob Egger 2019-12-06 13:25:46 Frontend/Backend Protocol: SSL / GSS Protocol Negotiation Problem