Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics

From: Greg Stark <stark(at)mit(dot)edu>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13895: Hot Standby locked replaying auto vacuum against pg_catalog.pg_statistics
Date: 2016-01-29 08:44:19
Message-ID: CAM-w4HOds47dX4eKxr3V2RQdbBc6PbiZ1D0bk=jtRuhsuCN3nQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

We saw this at Heroku a few times, one of which was posted here iirc.

I think the problem isn't that the truncate takes a long time. It's that it
deadlocks against the same tuple cleanups, possibly on the same table. It
was always the statistics table there too.

Recovery holding locks makes sense, and recovery pausing to avoid cleaning
tuples that are still needed is fine, buy the two cab deadlock against each
other. We could represent tuple cleanup with a lock when we need to block
which would at least let the deadlock detector notice but I'm not sure what
it should do since it can't kill recovery...

You can mitigate the problem by raising the vacuum cleanup age or using
feedback.

--
Greg
On 28 Jan 2016 1:43 pm, <dimitri(at)2ndquadrant(dot)fr> wrote:

> The following bug has been logged on the website:
>
> Bug reference: 13895
> Logged by: Dimitri Fontaine
> Email address: dimitri(at)2ndQuadrant(dot)fr
> PostgreSQL version: 9.3.10
> Operating system: Linux
> Description:
>
> I witnessed a case where I had no time to extract information about it, so
> it's all from memory, sorry about that.
>
> We had several Hot Standby nodes in 9.3.x used for load balancing read only
> queries. All queries were stuck, and pg_locks showed they were refused a
> lock against pg_catalog.pg_statistics.
>
> The lock owner PID was not to be found in the pg_stat_activity system view,
> for it was the replay process.
>
> I don't remember how I got to the conclusion it was replaying specifically
> autocavuum activity, but if that's helpful, that is what I remember it was
> about.
>
> My todo list includes extracting useful information next time it happens,
> before we just restart the standby (a restart fixes the availability /
> situation).
>
> Regards,
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2016-01-29 12:13:09 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Previous Message Tom Lane 2016-01-29 08:25:54 Re: BUG #13899: \det ignores visibility; patch attached