Re: Temporary tables prevent autovacuum, leading to XID wraparound

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Temporary tables prevent autovacuum, leading to XID wraparound
Date: 2018-01-30 17:46:12
Message-ID: CA+TgmoZRZV7DyU5vCGPTORRobv7CLyjiNKfsKz9No2wDCDQm1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 29, 2018 at 1:33 AM, Tsunakawa, Takayuki
<tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
>> Since the above idea would be complex a bit, as an
>> alternated idea it might be better to specify the number of worker to launch
>> per database by a new GUC parameter or something. If the number of worker
>> running on the database exceeds that limit, the launcher doesn't select
>> the database even if the database is about to wraparound.
>
> I'm afraid the GUC would be difficult for the user to understand and tune.

I agree. It's autovacuum's job to do the correct thing. If it's not,
then we need figure out how to make it do the right thing. Adding a
GUC seems like saying we don't know what the right thing to do is but
we hope the user does know. That's not a good answer.

Unfortunately, I think a full solution to the problem of allocating AV
workers to avoid wraparound is quite complex. Suppose that database A
will force a shutdown due to impending wraparound in 4 hours and
database B will force a shutdown in 12 hours. On first blush, it
seems that we should favor adding workers to A. But it might be that
database A needs only 2 hours of vacuuming to avoid a shutdown whereas
B needs 12 hours. In that case, it seems that we ought to instead
favor adding workers to B. However, it might be the case that A has
more table coming do for wraparound 6 hours from now, and we need
another 15 hours of vacuuming to avoid that shutdown. That would
favor adding workers to A. Then again, it might be that A and B
already both have workers, and that adding yet another worker to A
won't speed anything up (because only large tables remain to be
processed and each has a worker already), whereas adding a worker to B
would speed things up (because it still has a big table that we could
immediately start to vacuum for wraparound). In that case, perhaps we
ought to instead add a worker to B. But, thinking some more, it seems
like that should cause autovacuum_vacuum_cost_limit to be reallocated
among the workers, making the existing vacuum processes take longer,
which might actually make a bad situation worse. It seems possible
that the right answer could be to start no new autovacuum worker at
all.

Given all of the foregoing this seems like a very hard problem. I
can't even articulate a clear set of rules for what our priorities
should be, and it seems that such rules would depend on the rate at
which we're consuming XIDs, how close we are in each database to a
wraparound shutdown, what tables exist in each database, how big the
not-all-frozen part of each one is, how big their indexes are, how
much they're holding back relfrozenxid, and which ones already have
workers, among other things. I think it's quite possible that we can
come up with something that's better than what we have now without
embarking on a huge project, but it's not going to be anywhere near
perfect because this is really complicated, and there's a real risk
that we'll just making some cases better and others worse rather than
actually coming out ahead overall.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-01-30 18:46:37 Re: JIT compiling with LLVM v9.0
Previous Message Oliver Ford 2018-01-30 17:16:35 Re: Add RANGE with values and exclusions clauses to the Window Functions