BUG #13970: Vacuum hangs on particular table; cannot be terminated - requires `kill -QUIT pid`

From: brian(at)pukkasoft(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #13970: Vacuum hangs on particular table; cannot be terminated - requires `kill -QUIT pid`
Date: 2016-02-18 10:56:33
Message-ID: 20160218105633.2660.10202@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 13970
Logged by: Brian Ghidinelli
Email address: brian(at)pukkasoft(dot)com
PostgreSQL version: 9.4.6
Operating system: Linux (RHEL 5.11)
Description:

Hi Pg team - I've been running a 9.4.1 server for the last year+. In the
past few months I've had a couple of instances of the server locking up.
I've done more troubleshooting into this last event and uncovered what
appears to be a bug. My situation is much like these:

http://comments.gmane.org/gmane.comp.db.postgresql.admin/40587
http://postgresql.nabble.com/VACUUM-hanging-on-PostgreSQL-8-3-1-for-larger-tables-td1898438.html

The former claims lightweight locks had a bug up through 9.4.5, but I'm
running the latest 9.4.6 and still experiencing this issue with one table.
Here's the scenario:

* Either autovacuum OR manual vacuum on a single table hangs
* There is no CPU or I/O usage; top -p <pid> shows the vacuum process is
sleeping
* strace of the process id shows rapidly scrolling screenfuls of `select(0,
NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)`
* Running a query against pg_locks shows the vacuum has been granted a lock,
but it was not taken via the fast path.
* There are no other queries running... I can trigger this behavior after a
fresh reboot, with no other users connected, by issuing a simple vacuum.
* I have reindexed the table, and also dropped every index except the
primary key in case there were issues with an index - it still hung when
vacuum was attempted
* Interestingly, when I check the last autovacuum/autoanalyze report, the
values are all blank, even though I have autovacuum on
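
For readers who want to repeat the pg_locks check from the list above, here is a minimal sketch. The query text, the helper name `granted_slow_path`, and the sample rows are all assumptions for illustration; none of them come from the report. The `granted` and `fastpath` columns are real pg_locks columns (`fastpath` exists since 9.2).

```python
# Query a reader might run through psql or a driver; %s stands for the
# process id of the hung vacuum backend. This exact query is an assumption,
# not the one the reporter used.
LOCKS_SQL = """
SELECT locktype, relation::regclass AS relation, mode, granted, fastpath
FROM pg_locks
WHERE pid = %s
"""


def granted_slow_path(rows):
    """Return locks that are granted but were not taken via the fast path."""
    return [r for r in rows if r["granted"] and not r["fastpath"]]


# Hypothetical rows shaped like the query's output (not taken from the report).
sample_rows = [
    {"locktype": "relation", "relation": "some_table",
     "mode": "ShareUpdateExclusiveLock", "granted": True, "fastpath": False},
    {"locktype": "virtualxid", "relation": None,
     "mode": "ExclusiveLock", "granted": True, "fastpath": True},
]

for row in granted_slow_path(sample_rows):
    print(row["locktype"], row["relation"], row["mode"])
```

As far as I know, the fast path only covers the weaker relation lock modes, so a granted, non-fast-path entry for the lock VACUUM takes on the table would be expected on a healthy system too and does not by itself explain the hang.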

It scares me a lot that pg_cancel_backend and pg_terminate_backend don't
work. It requires a kill -QUIT to break out of this process.
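
As background on the escalation described above, here is a small sketch of which signal each step delivers to the backend. The mapping reflects documented PostgreSQL behavior (pg_cancel_backend sends SIGINT, pg_terminate_backend sends SIGTERM); the names `ESCALATION` and `describe` are hypothetical, and the code only prints the mapping, it does not signal anything.

```python
import signal

# Action -> signal delivered to the backend process. Listed from least to
# most drastic. Stated as general background, not taken from the report.
ESCALATION = [
    ("pg_cancel_backend(pid)", signal.SIGINT),
    ("pg_terminate_backend(pid)", signal.SIGTERM),
    ("kill -QUIT pid", signal.SIGQUIT),
]


def describe(steps):
    """Format each escalation step as 'action -> SIGNAL'."""
    return [f"{action} -> {sig.name}" for action, sig in steps]


for line in describe(ESCALATION):
    print(line)
```

As I understand it, a backend only acts on SIGINT and SIGTERM when it reaches an interrupt check, so a process stuck in a loop that never reaches one would ignore both. SIGQUIT makes the backend exit immediately, which the postmaster is likely to treat as a crash and respond to by resetting the other sessions, so it is a last resort.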

What can I investigate to help add more information?

Brian
