Re: PostgreSQL hot standby Hangs due to AccessExclusiveLock on pg_attribute or pg_type tables

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: bruno(dot)richard(at)broadsign(dot)com, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: PostgreSQL hot standby Hangs due to AccessExclusiveLock on pg_attribute or pg_type tables
Date: 2017-07-11 16:39:17
Message-ID: CAMkU=1zV8rvpUUXd7Bgt53HtOEtBnUXok76RwfWAJBprO-+5Bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Jul 11, 2017 at 3:01 AM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello.
>
> In conclusion, this doesn't seem to be a bug but just a standby
> conflict.
>
> At Mon, 19 Jun 2017 13:29:34 -0400, Bruno Richard <
> bruno(dot)richard(at)broadsign(dot)com> wrote in <CAB-EU3RawZx8-OzMfvswFf6z+Y7G
> OZf03TZ=bez+PbQX+A4M7Q(at)mail(dot)gmail(dot)com>
> > *Summary:*
> > PostgreSQL hot standby Hangs due to AccessExclusiveLock on pg_attribute
> or
> > pg_type tables
> >
> > *Platform information (all instances)*
> > PostgreSQL 9.6.2, ltree extension enabled.
> > Ubuntu 12.04 64bits
> >
> > *Description:*
> > We have an issue on our Database system since we migrated from PostgreSQL
> > 9.3.2 to 9.6.2.
> > The issue is that one (or many) of our hot standby instances does an
> > AccessExclusiveLock on the pg_attribute or pg_type system tables that
> never
> > gets unlocked, causing hangs on our systems. We have to restart the
> > instance to get out of the lock. It happened ~30 times since end of
> April.
> >
> > *Steps to reproduce:*
> > Looks like a race condition, happens randomly in production only. I am
> > trying to replicate in house.
> >
> > *Description:*
> > This is the output from the pg_locks table when the issue occurs:
> >
> > -[ RECORD 129 ]----+--------------------
> > locktype | relation
> > database | 16385
> > relation | 1249
> > page |
> > tuple |
> > virtualxid |
> > transactionid |
> > classid |
> > objid |
> > objsubid |
> > virtualtransaction | 1/0
> > pid | 19018
> > mode | AccessExclusiveLock
> > granted | t
> > fastpath | f
>
> I'm not sure why it didn't happen on 9.3,

In 9.3, the AccessExclusiveLock is obtained by polling. If it can never
get the lock, it won't stop other processes from getting an
AccessShareLock, so it won't block their ability to log on. (Once
max_standby_streaming_delay expires, it will then slaughter everything and
take its lock.) In 9.6, it is obtained in the more conventional way, by
waiting while blocking newcomers who want a conflicting version. I don't
understand exactly why this change leads to the reported behavior, but it
is probably related.

but it is very likely
> to be the consequence of a standby conflict. You are setting
> max_standby_streaming_delay to 1200 seconds so the lock will be
> released after the same duration at maximum.
>
> The lock shown above almost certainly is a vacuum-truncation
> lock. Usually the lock is released soon by a following
> commit. However, if some other conflicting xlogs (for example
> vaccum cleanup records from a concurrent vacuum) comes before the
> commit the truncation lock won't be released until the conflict
> is resolved. This would be what you are looking.
>

Since no new sessions can connect once the pg_attributes table is locked,
this would either be a self-deadlock (which I think is a bug) or it would
require there to be long-lived connections to the standy-by which are
holding the locks which are blocking the vacuum cleanup .

Bruno, is this correct that you have long-lived transactions to the standby
while this is occurring with pg_attribute? Does it correct after 20
minutes, if you are willing to wait that long?

Cheers,

Jeff

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2017-07-11 19:39:14 Re: BUG #14738: ALTER SERVER for foregin servers not working
Previous Message jrogers 2017-07-11 15:11:28 BUG #14740: PgAdmin 4 Crashing when creating a PgAgent Schedule