Re: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Valentine Gogichashvili" <valgog(at)gmail(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
Date: 2008-02-08 18:11:39
Message-ID: 47AC9B5B.4090701@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Valentine Gogichashvili wrote:
> The following bug has been logged online:
>
> Bug reference: 3942
> Logged by: Valentine Gogichashvili
> Email address: valgog(at)gmail(dot)com
> PostgreSQL version: 8.2.3

You should upgrade to the latest 8.2.X minor release, regardless of this
issue...

> Operating system: Linux 2.6.8 SMP
> Description: Related to BUG #2568 and BUG #2859: VACUUM process hangs
> and does not respond to the kill signals
> Details:
>
> We are running several servers with the 'same' configuration.
>
> Recently two of many servers started to experience similar problems. It is
> quite difficult to reproduce. So I am providing the information that I have
> managed to get by now. The problem on the second server was to be seen only
> once. The problem on the first server is happening once a day or so. It does
> not seem to actually depend on the server load.
>
> The problem is normally noticed when the update sessions on one of the
> biggest table are hanging. The server has a 'statement_timeout' parameter
> set globally to 20 seconds and cronjob vacuum jobs are being run every 2
> hours like:
> "set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less
> the update sessions are not killed for hours.

Sounds like you have a client connected to the database, but doing
nothing. statement_timeout won't do anything to a backend that isn't
actively running a query, even if it has a transaction open. You could
do "SELECT * FROM pg_stat_activity" and look for any backends in "idle
in transaction" state.

> I cannot kill VACUUM session explicitly as well.

How are you trying to kill it?

>
> More info:
>
> rumata(at)dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep
> postgres 24359 21321 0 13:47 ? 00:00:12 postgres: postgres
> rumataindex [local] VACUUM
> rumata(at)dbserver-04:~$ sudo su - postgres
> postgres(at)dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type: \copyright for distribution terms
> \h for help with SQL commands
> \? for help with psql commands
> \g or terminate with semicolon to execute query
> \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
> locktype | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction | pid | mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
> relation | 17717 | 18917 | | | |
> | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t
> relation | 17717 | 1892961 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94216 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94213 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94214 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94206 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 185865 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 952119 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94212 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94215 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94207 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 185867 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94217 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> transactionid | | | | | 1045306585 |
> | | | 1045306585 | 24359 | ExclusiveLock | t
> relation | 17717 | 94208 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94211 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> (16 rows)
>
> rumataindex=# \q
> postgres(at)dbserver-04:~$ kill 24359
> postgres(at)dbserver-04:~$ psql rumataindex
> Welcome to psql 8.2.3, the PostgreSQL interactive terminal.
>
> Type: \copyright for distribution terms
> \h for help with SQL commands
> \? for help with psql commands
> \g or terminate with semicolon to execute query
> \q to quit
>
> rumataindex=# select * from pg_locks where pid = 24359;
> locktype | database | relation | page | tuple | transactionid |
> classid | objid | objsubid | transaction | pid | mode
> | granted
> ---------------+----------+----------+------+-------+---------------+-------
> --+-------+----------+-------------+-------+--------------------------+-----
> ----
> relation | 17717 | 18917 | | | |
> | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t
> relation | 17717 | 1892961 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94216 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94213 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94214 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94206 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 185865 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 952119 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94212 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94215 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94207 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 185867 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94217 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> transactionid | | | | | 1045306585 |
> | | | 1045306585 | 24359 | ExclusiveLock | t
> relation | 17717 | 94208 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> relation | 17717 | 94211 | | | |
> | | | 1045306585 | 24359 | RowExclusiveLock | t
> (16 rows)
>
> rumataindex=# \q
>
> In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned
> locks are the oldest ones, when the issue is occurring.
>
> The only way to restart the server is to force immediate server shutdown
> like:
>
> postgres(at)dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> postgres(at)dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f
> waiting for server to shut
> down............................................................... failed
> pg_ctl: server does not shut down
> postgres(at)dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i
> waiting for server to shut down.... done
> server stopped
> postgres(at)dbserver-04:~/pgsql/data$ pg_ctl -D . start
> server starting
>
> There are no errors or warning related to VACUUM to be noticed in the log
> files. Or any logging related to the sessions, that are still hanging.

Is there anything in the log indicating why it refuses to shut down?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Arnaud 2008-02-08 18:42:53 BUG #3943: ecpg doesn't like "inet" operator
Previous Message Valentine Gogichashvili 2008-02-08 16:21:44 BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals