BUG #14324: Deadlock? between queries and startup process on standby replica.

From: maxim(dot)boguk(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14324: Deadlock? between queries and startup process on standby replica.
Date: 2016-09-14 08:47:56
Message-ID: 20160914084756.1420.59188@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14324
Logged by: Maksym Boguk
Email address: maxim(dot)boguk(at)gmail(dot)com
PostgreSQL version: 9.4.7
Operating system: Linux
Description:

Hi,

I found a strange issue where one of our standby replicas stopped replaying WAL.

During investigation of the issue I found the following facts:

1) The startup process is waiting for something:

ps -axuww | grep startup
postgres 1577 2.1 0.0 177360 76616 ? Ss Sep12 56:49 postgres: startup process recovering 00000001000087480000002A waiting
(wal position doesn't advance at all)
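For reference, that the replay position is genuinely stalled (rather than the process merely being idle) can be double-checked with something like the following sketch (9.4 function names; they were renamed to pg_last_wal_* in later releases):

-- If "replayed" stays constant across repeated runs while "received"
-- keeps advancing, the startup process is stuck rather than idle.
SELECT pg_last_xlog_receive_location() AS received,
       pg_last_xlog_replay_location()  AS replayed,
       pg_last_xact_replay_timestamp() AS last_replay_time;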

2) All locks for pid=1577 appear to be granted:

select * from pg_locks where pid=1577;

  locktype  | database | relation  | virtualxid | virtualtransaction | pid  |        mode         | granted | fastpath
------------+----------+-----------+------------+--------------------+------+---------------------+---------+----------
 virtualxid |          |           | 1/1        | 1/0                | 1577 | ExclusiveLock       | t       | t
 relation   | 16424    | 166782199 |            | 1/0                | 1577 | AccessExclusiveLock | t       | f
 relation   | 16424    | 166782196 |            | 1/0                | 1577 | AccessExclusiveLock | t       | f
 relation   | 16424    | 17256     |            | 1/0                | 1577 | AccessExclusiveLock | t       | f

(the always-empty columns page, tuple, transactionid, classid, objid and
objsubid are omitted here and in the lock listings below, for readability)

3) There are curious relations in the pg_locks data above (166782199 and
166782196) which do not correspond to any entries in pg_class on the target
database.
I checked pg_class on the target replica, on the master DB, and (as a last
attempt to find out what is going on) via pg_dump -F p -s --binary-upgrade
dbname. Nothing.
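The kind of cross-check described above can be written as a single query (a sketch; it simply resolves the lock OIDs via pg_class on whichever node it is run against):

-- On this standby, 166782199 and 166782196 come back with relname = NULL.
SELECT l.relation, c.relname
FROM pg_locks l
LEFT JOIN pg_class c ON c.oid = l.relation
WHERE l.pid = 1577
  AND l.locktype = 'relation';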

4) Every open transaction on this problem slave has (in pg_stat_activity):

waiting=true
state=active
and all of them got stuck at around the same time as the startup process.
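A minimal sketch of how those sessions show up (9.4 still has the boolean "waiting" column; it was replaced by wait_event_type/wait_event in later releases):

SELECT pid, state, waiting, xact_start, query
FROM pg_stat_activity
WHERE waiting AND state = 'active';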

5) Every open (and waiting) transaction on this slave has a not-granted
AccessShareLock on relation 17256 (which is expected, because the startup
process holds an AccessExclusiveLock on it):
 locktype | database | relation | virtualtransaction | pid  |      mode       | granted | fastpath
----------+----------+----------+--------------------+------+-----------------+---------+----------
 relation | 16424    | 17256    | 167/838            | 376  | AccessShareLock | f       | f
 relation | 16424    | 17256    | 170/503            | 377  | AccessShareLock | f       | f
 relation | 16424    | 17256    | 173/162            | 1680 | AccessShareLock | f       | f
 relation | 16424    | 17256    | 166/1558           | 1729 | AccessShareLock | f       | f
...
and so on.
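The whole queue can be listed in one go (a sketch, using the relation OID from the output above):

SELECT pid, virtualtransaction, mode, granted
FROM pg_locks
WHERE relation = 17256
  AND NOT granted;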

6) Killing a few of the individual oldest open transactions doesn't fix the issue.

7) The full lock info for a stuck transaction looks like this (they are all
the same in terms of query structure):
select * from pg_locks where pid=376;

  locktype  | database | relation | virtualxid | virtualtransaction | pid |      mode       | granted | fastpath
------------+----------+----------+------------+--------------------+-----+-----------------+---------+----------
 relation   | 16424    | 16937    |            | 167/838            | 376 | AccessShareLock | t       | t
 relation   | 16424    | 88781520 |            | 167/838            | 376 | AccessShareLock | t       | t
 relation   | 16424    | 18150    |            | 167/838            | 376 | AccessShareLock | t       | t
 virtualxid |          |          | 167/838    | 167/838            | 376 | ExclusiveLock   | t       | t
 relation   | 16424    | 17256    |            | 167/838            | 376 | AccessShareLock | f       | f

8) strace on the startup process (strace -p 1577) doesn't produce anything
useful... only an endless:
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)

9) This is not the first time the situation has happened (maybe once per month
or so), each time after creation of some FK on the master database.
This time it happened after running:
CREATE TABLE employer_vacancy_push_subscription (
...
user_id INTEGER NOT NULL REFERENCES hhuser (user_id),
vacancy_id INTEGER NOT NULL REFERENCES vacancy (vacancy_id),
...
);
Table hhuser has OID=17256 (the lock that is not granted for all the waiting
transactions).
Table vacancy has OID=18150 (which is granted for all the waiting
transactions), but (possibly) not granted for the startup process.
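During the incident this can be checked from a single query (a sketch, using the two OIDs above); the result is consistent with the output under point 2), where pid 1577 holds a granted AccessExclusiveLock on 17256 but shows no entry for 18150 at all:

SELECT pid, relation, mode, granted
FROM pg_locks
WHERE relation IN (17256, 18150)
ORDER BY relation, pid;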

10) Even killing all the waiting transactions doesn't help, because at the
same instant more transactions of the same kind get opened (the load is quite
heavy, even though the DB is non-interactive), and the startup process gets
stuck in the same place again.
Only manually pausing all activity via pgbouncer and killing all the stuck
transactions (to let the startup process get past this WAL record) fixes the
problem.
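A sketch of that manual recovery, assuming the database has already been paused through pgbouncer's admin console (PAUSE) so no new sessions arrive; terminating every backend queued on relation 17256 then lets the startup process finish replaying the record:

SELECT pg_terminate_backend(pid)
FROM pg_locks
WHERE relation = 17256
  AND NOT granted;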

A possible (and likely) explanation of what's going on:
The startup process attempts to replay the WAL record for the CREATE TABLE,
which requires an AccessExclusiveLock on hhuser (OID=17256), and it
successfully gets that lock. But once the startup process attempts to grab an
AccessExclusiveLock on vacancy (OID=18150), it cannot get it, because the
relation is already locked by concurrent transactions (which grab their
AccessShareLocks in the opposite order: first on vacancy and only then on
hhuser).
Those concurrent transactions cannot make progress either, because they
cannot grab an AccessShareLock on hhuser (it is already locked by the startup
process).

As a result, the database has an undetected deadlock between the startup
process and queries, which does not resolve automatically (and is not visible
in pg_locks in a clean way).
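Spelled out, the suspected cycle looks like this:

startup process (replaying the CREATE TABLE ... REFERENCES hhuser, vacancy):
    holds AccessExclusiveLock on hhuser  (17256)  -- granted
    wants AccessExclusiveLock on vacancy (18150)  -- blocked by the readers
reader backends (376, 377, 1680, 1729, ...):
    hold  AccessShareLock on vacancy (18150)  -- granted
    want  AccessShareLock on hhuser  (17256)  -- blocked by the startup process

Neither side can proceed, and the startup process's wait for 18150 never shows
up in pg_locks (compare point 2), so the regular deadlock detector has nothing
to detect; that would also fit the bare select() polling loop seen under
point 8).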

Kind Regards,
Maksym
