Re: Temporary tables prevent autovacuum, leading to XID wraparound

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Temporary tables prevent autovacuum, leading to XID wraparound
Date: 2018-02-09 07:34:27
Message-ID: CAD21AoBtuzV4M9=96epL9pEd0T-SmfMVYZum9FdUmVtwvDf8zA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Feb 3, 2018 at 1:48 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Feb 2, 2018 at 1:27 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> Thank you for suggestion. It sounds more smarter. So it would be more
>> better if we vacuums database for anti-wraparound in ascending order
>> of relfrozenxid?
>
> Currently, we're doing it based on datfrozenxid. I was looking for a
> small, relatively safe change to improve things, which led to my
> proposal: continue using datfrozenxid when the database isn't being
> vacuumed, but when it is, substitute the datfrozenxid that we expect
> the database *will have* after the tables currently being vacuumed are
> finished for the actual datfrozenxid.
>
> On further reflection, though, I see problems. Suppose db1 has
> age(datfrozenxid) = 600m and two tables with age(relfrozenxid) of 600m
> and 400m. db2 has age(datfrozenxid) = 500m. Then suppose #1 starts
> vacuuming the table with age 600m. The AV launcher sees that, with
> that table out of the way, we'll be able to bring datfrozenxid forward
> to 400m and so sends worker #2 to db2. But actually, when the worker
> in db1 finishes vacuuming the table with age 600m, it's going to have
> to vacuum the table with age 400m before performing any relfrozenxid
> update. So actually, the table with age 400m is holding the
> relfrozenxid at 600m just as much as if it too had a relfrozenxid of
> 600m. In fact, any tables in the same database that need routine
> vacuuming are a problem, too: we're going to vacuum all of those
> before updating relfrozenxid, too.
>
> Maybe we should start by making the scheduling algorithm used by the
> individual workers smarter:
>
> 1. Let's teach do_autovacuum() that, when there are any tables needing
> vacuum for wraparound, either for relfrozenxid or relminmxid, it
> should vacuum only those and forget about the rest. This seems like
> an obvious optimization to prevent us from delaying
> datfrozenxid/datminmxid updates for the sake of vacuuming tables that
> are "merely" bloated.
>
> 2. Let's have do_autovacuum() sort the tables to be vacuumed in
> ascending order by relfrozenxid, so older tables are vacuumed first.
> A zero-order idea is to put tables needing relfrozenxid vacuuming
> before tables needing relminmxid vacuuming, and sort the latter by
> ascending relminmxid, but maybe there's something smarter possible
> there. The idea here is to vacuum tables in order of priority rather
> than in whatever order they happen to be physically mentioned in
> pg_class.
>
> 3. Let's have do_autovacuum() make an extra call to
> vac_update_datfrozenxid() whenever the next table to be vacuumed is at
> least 10 million XIDs (or MXIDs) newer than the first one it vacuumed
> either since the last call to vac_update_datfrozenxid() or, if none,
> since it started. That way, we don't have to wait for every single
> table to get vacuumed before we can consider advancing
> relfrozenxid/relminmxid.
>
> 4. When it's performing vacuuming for wraparound, let's have AV
> workers advertise in shared memory the oldest relfrozenxid and
> relminmxid that it might exist in the database. Given #1 and #2, this
> is pretty easy, since we start by moving through tables in increasing
> relfrozenxid order and then shift to moving through them in increasing
> relminmxid order. When we're working through the relfrozenxid tables,
> the oldest relminmxid doesn't move, and the oldest relfrozenxid is
> that of the next table in the list. When we're working through the
> relminmxid tables, it's the reverse. We need a little cleverness to
> figure out what value to advertise when we're on the last table in
> each list -- it should be the next-higher value, even though that will
> be above the relevant threshold, not a sentinel value.

So I think we should includes tables as well that are not at risk of
wraparound in order to get the next-higher value (that is, the oldest
table of the non-risked tables) instead of forgetting them. And then
we skip vacuuming them.

>
> 5. With those steps in place, I think we can now adopt my previous
> idea to have the AV launcher use any advertised relfrozenxid (and, as
> I now realize, relminmxid) instead of the ones found in pg_database,
> because now we know that individual workers are definitely focused on
> getting relfrozenxid (and/or relminmxid) as soon as possible, and
> vacuuming unrelated tables won't help them do it any faster.
>
> This gets us fairly close to vacuuming tables in decreasing order of
> wraparound danger across the entire cluster. It's not perfect. It
> prefers to keep vacuuming tables in the same database rather than
> having a worker exit and maybe launching a new one in a different
> database -- but the alternative is not very appealing. If we didn't
> do it that way, and if we had a million tables with XIDs that were
> closely spaced spread across different databases, we'd have to
> terminate and relaunching workers at a very high rate to get
> everything sorted out, which would be inefficient and annoying to
> program. Also, it keeps the existing hard prioritization of
> relfrozenxid over relminmxid, which could theoretically be wrong for
> some installation. But I think that might not be a big problem in
> practice, and it seems like that could be separately improved at
> another time.
>

I think this algorithm works fine and improves the current behavior.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergei Kornilov 2018-02-09 07:47:41 Re: Using scalar function as set-returning: bug or feature?
Previous Message Konstantin Knizhnik 2018-02-09 07:28:23 Using scalar function as set-returning: bug or feature?