Quick Links

Postgres Query Plan Live Lock

From:	"Pweaver (Paul Weaver)" <pweaver(at)panjiva(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Cc:	Michael Vezza <michael(at)panjiva(dot)com>
Subject:	Postgres Query Plan Live Lock
Date:	2014-02-03 21:35:43
Message-ID:	CAFTGa=mvkpL_mOi4NT4LbB3E1QnqVfQ+QQfgSJBfZ1WHujnvpA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

We have been running into a (live lock?) issue on our production Postgres
instance causing queries referencing a particular table to become extremely
slow and our application to lock up.

This tends to occur on a particular table that gets a lot of queries
against it after a large number of deletes. When this happens, the
following symptoms occur when queries referencing that table are run (even
it we stop the deleting):

SELECT * FROM table_name LIMIT 10; -- takes ~45 seconds to complete
EXPLAIN SELECT * FROM table_name LIMIT 10; -- takes ~45 seconds to
complete the explain query, the query plan looks reasonable
EXPLAIN SELECT * FROM table_name LIMIT 10; -- takes ~45 seconds to
complete the explain analyze query, query plan looks reasonable, timing
stats says query took sub millisecond time to complete

SELECT * FROM another_table LIMIT 10; -- takes sub millisecond time
EXPLAIN * FROM another_table LIMIT 10; -- takes sub millisecond time, query
plan looks reasonable

This behavior only stops and the queries go back to taking sub millisecond
time if we take the application issuing the SELECTs offline and wait for
the active queries to finish (or terminate them).

There is not a particularly large load on the database machine at the time,
neither are there a particularly large number of wal logs being written
(although there is a burst of wal log writes immediately after the queue is
cleared).

table_name stats:
~ 400,000,000 rows
We are deleting 10,000,000s of rows in 100,000 row increments over a few
days time prior/during this slowdown.
Simultaneously a web app is querying this table continuously.

table_name has 4 btree indexes on it (one of which is set to CLUSTER) and
one foreign key constraint.

The obvious workaround is to not delete so much data on the table on our
production database, but we would like to figure out why Postgres is live
locking this table. Do you have any ideas why this is happening and how to
prevent it while still doing mass deletes on the table?

-------------------------------------------------------------------------

System information:

Postgres Version - PostgreSQL 9.2.3 on x86_64-unknown-linux-gnu, compiled
by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
OS - Ubuntu 12.04 LTS

Autovacuum is on.

--------------------------------------------------------------------------

--
Thank You,
Pweaver (pweaver(at)panjiva(dot)com)

Responses

Re: Postgres Query Plan Live Lock at 2014-02-05 02:03:43 from Peter Geoghegan
Postgres Query Plan Live Lock at 2014-02-05 14:52:20 from Jeff Janes
Re: Postgres Query Plan Live Lock at 2014-02-05 16:39:45 from bricklen

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Jeff Janes	2014-02-05 00:14:51	Re: [PERFORM] encouraging index-only scans
Previous Message	Tom Lane	2014-02-03 21:03:32	Re: Planner estimates and VACUUM/autovacuum