Re: improving wraparound behavior

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: improving wraparound behavior
Date: 2019-05-04 02:06:20
Message-ID: 20190504020620.5n4m5wsjzoigs4qi@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-05-03 21:36:24 -0400, Robert Haas wrote:
> On Fri, May 3, 2019 at 8:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Part of my opposition to just disabling it when close to a wraparound,
> > is that it still allows to get close to wraparound because of truncation
> > issues.
>
> Sure ... it would definitely be better if vacuum didn't consume XIDs
> when it truncates. On the other hand, only a minority of VACUUM
> operations will truncate, so I don't think there's really a big
> problem in practice here.

I've seen a number of out-of-xid shutdowns precisely because of
truncations: initially because autovacuum cancels itself when somebody
else wants a conflicting lock, and later because there's so much dead
space that people kill (auto)vacuum to get rid of the exclusive locks.

> > IMO preventing getting closer to wraparound is more important
> > than making it more "comfortable" to be in a wraparound situation.
>
> I think that's a false dichotomy. It's impossible to create a
> situation where no user ever gets into a wraparound situation, unless
> we're prepared to do things like automatically drop replication slots
> and automatically roll back (or commit?) prepared transactions. So,
> while it is good to prevent a user from getting into a wraparound
> situation where we can, it is ALSO good to make it easy to recover
> from those situations as painlessly as possible when they do happen.

Sure, but I've seen a number of real-world cases of xid wraparound
shutdowns related to truncations, and no real-world problem due to
truncations assigning an xid.

> > The second problem I see is that even somebody close to a wraparound
> > might have an urgent need to free up some space. So I'm a bit wary of
> > just disabling it.
>
> I would find that ... really surprising. If you have < 11 million
> XIDs left before your data gets eaten by a grue, and you file a bug
> report complaining that vacuum won't truncate your tables until you
> catch up on vacuuming a bit, I am prepared to offer you no sympathy at
> all.

I've seen wraparound issues triggered by autovacuum generating so much
WAL that the system ran out of space, crashed, restarted, and then
repeated the cycle. Being unable to reclaim space could make that even
harder to tackle.

> > Wonder if there's a reasonable way that'd allow to do the WAL logging
> > for the truncation without using an xid. One way would be to just get
> > rid of the lock on the primary as previously discussed. But we could
> > also drive the locking through the WAL records that do the actual
> > truncation - then there'd not be a need for an xid. It's probably not a
> > entirely trivial change, but I don't think it'd be too bad?
>
> Beats me. For me, this is just a bug, not an excuse to redesign
> vacuum truncation. Before Hot Standby, when you got into severe
> wraparound trouble, you could vacuum all your tables without consuming
> any XIDs. Now you can't. That's bad, and I think we should come up
> with some kind of back-patchable solution to that problem.

I agree we need to do at least a minimal version that can be
backpatched.

I don't think we necessarily need a new WAL record for what I'm
describing above (as XLOG_SMGR_TRUNCATE already carries information
about which forks are truncated, we could just have it acquire the
exclusive lock), and I don't think we'd need a ton of code for eliding
the WAL-logged lock either. I think the issue with backpatching would
be that we can't remove the logged lock without creating hazards for
standbys running older versions of postgres.
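
To make the idea above a bit more concrete, here is a toy model in
Python. It is only a sketch, not PostgreSQL code: TruncateRecord,
Standby and replay_truncate are hypothetical names, and releasing the
lock right after the truncation is an assumption of the sketch (the
proposal above doesn't settle when the lock would be released). The
point is just that replay of the truncation record takes the exclusive
lock itself, so no separate xid-bearing lock record is needed.

```python
# Toy model only: none of these names are PostgreSQL APIs.
from dataclasses import dataclass, field


@dataclass
class TruncateRecord:
    """Stand-in for a truncation WAL record that lists the forks it affects."""
    relation: str
    new_blocks: int
    forks: tuple = ("main",)


@dataclass
class Standby:
    sizes: dict = field(default_factory=dict)    # (relation, fork) -> blocks
    exclusive: set = field(default_factory=set)  # relations locked by replay
    log: list = field(default_factory=list)      # ordered trace of actions

    def replay_truncate(self, rec: TruncateRecord) -> None:
        # The lock is taken as part of replaying the truncation record,
        # instead of being driven by a separate lock record tied to an xid.
        self.exclusive.add(rec.relation)
        self.log.append(("lock", rec.relation))
        try:
            for fork in rec.forks:
                key = (rec.relation, fork)
                # Truncation never grows a fork.
                self.sizes[key] = min(self.sizes.get(key, 0), rec.new_blocks)
                self.log.append(("truncate", key, self.sizes[key]))
        finally:
            # Assumption of this sketch: release immediately after truncating.
            self.exclusive.discard(rec.relation)
            self.log.append(("unlock", rec.relation))


standby = Standby(sizes={("t", "main"): 100})
standby.replay_truncate(TruncateRecord("t", 40))
```

In this model the trace in standby.log always brackets the truncation
with the lock and unlock steps, which is the ordering a standby query
would rely on.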

Greetings,

Andres Freund
