RE: PG12 autovac issues

From: Justin King <kingpin867(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: RE: PG12 autovac issues
Date: 2020-03-19 15:23:48
Message-ID: CAE39h23RTX1jkYjWc5tccv34HwwraizaCUxOmdQdPM+Zt5-2Qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-general

Hi Andres-

Thanks for the reply, answers below.

On Tue, Mar 17, 2020 at 8:19 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2020-03-17 17:18:57 -0500, Justin King wrote:
> > As you can see in this table, there are only ~80K rows, but billions
> > of updates. What we have observed is that the frozenxid reaches the
> > 200M mark fairly quickly because of the amount of activity.
>
> And each of those updates is in a separate transaction? Is that
> required? I.e. any chance to perform multiple of those updates in one
> transaction?
>
> Have you considered just increasing the vacuum limit? It's mostly there
> because it can increase space usage a bit, but given today's systems its
> not a usually a problem unless you have hundreds of postgres instances
> on a single system. It's pretty common to just increase that to 1.8
> billion or so.

We have considered increasing the limit, but as I mentioned, the
problem that we're seeing is that (effectively) a autovac starts on a
system database (postgres, template1) and never completes, or
deadlocks, or something. This completely stops autovacs from running
until we manually intervene and run a VACUUM FREEZE -- at which point,
the autovacs resume. If we increase the vacuum limit and this
situation happens, we're going to be in real trouble.

> From a single stats snapshot we can't actually understand the actual xid
> consumption - is it actually the xid usage that triggers the vacuums?

We have looked at this and the xid consumption averages around 1250
xid/sec -- this is when we see the "aggressive" autovac kick off in
the logs. What I don't understand is why these xid's are being
consumed at this rate on the databases with no activity (postgres,
template1).

>
>
> What makes you think it is a problem that you have all these vacuums? If
> you actually update that much, and you have indexes, you're going want a
> lot of vacuums?
>

I actually don't think there's a problem with the vacuums (I was
mostly pointing out that they are very regular and not problematic).
The main problem I am having is that something is causing the
autovacuums to completely stop and require manual intervention to
resume -- and it seems to be when the "postgres" or "template1"
database hits the autovacuum_freeze_max_age.

>
> > What is interesting is that this happens with the 'postgres' and
> > 'template1' databases as well and there is absolutely no activity in
> > those databases.
>
> That's normal. They should be pretty darn quick in v12?

Yes, a manual VACUUM FREEZE of either database takes less than 1
second -- which is why it's perplexing that the autovac starts and
never seems to complete and prevents other autovacs from running.

>
> Greetings,
>
> Andres Freund

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Justin King 2020-03-19 15:25:25 Re: Re: PG12 autovac issues
Previous Message Sandeep Thakkar 2020-03-19 11:16:47 Re: PG v12.2 - Setting jit_above_cost is causing the server to crash

Browse pgsql-general by date

  From Date Subject
Next Message Justin King 2020-03-19 15:25:25 Re: Re: PG12 autovac issues
Previous Message Michael Paquier 2020-03-19 13:09:28 Re: RPMs from postgresql.org break CentOS/RHEL RPMs