BUG #5465: dblink TCP connection hangs blocking translation from being terminated

From: "Valentine Gogichashvili" <valgog(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
Date: 2010-05-19 09:10:29
Message-ID: 201005190910.o4J9AT3A063957@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5465
Logged by: Valentine Gogichashvili
Email address: valgog(at)gmail(dot)com
PostgreSQL version: 8.2.1
Operating system: Red Hat 3.4.6-3 (kernel 2.6.9-42.0.3.ELsmp)
Description: dblink TCP connection hangs blocking translation from
being terminated
Details:

Hi all,

we have an issue on our productive server. A stored procedure, that uses
dblink to get some data from the remote database hangs not responding to
kill signal and holds several locks on some tables as well as an advisory
lock. So I have this transaction to be completed in order to have a
possibility to operate the database normally.

It was exactly on the time, that the procedure was accessing remote
database, the machine hosting this remote database had a panic attack and
rebooted. But the ESTABLISHED connection is still hanging on the production
database machine:

$ netstat | grep remote_db_host
tcp 0 0 production_db_host:60248 remote_db_host:postgres ESTABLISHED

$ lsof | grep remote_db_host
postgres 1365 postgres 199u IPv4 23003779784 TCP
production_db_host:60248->remote_db_host:postgres (ESTABLISHED)

On the database session list one can see the hanging transaction:

production_db=# select procpid, now() - query_start as running, waiting,
substr(current_query,1,120) as current_query from pg_stat_activity where
current_query not like '%----STATQ-----%' and current_query != '<IDLE>'
order by query_start desc;
procpid | running | waiting |
current_query
---------+------------------------+---------+-------------------------------
----------------------------------------------------------------------------
------------------------------------
1365 | 2 days 00:17:57.992004 | f | SELECT * FROM
get_remote_data()

It seems like the dblink is waiting for the connection to be closed or
reseted and also makes the hole transaction hang not processing kill
signals.

Does the dblink TCP connection have any timeout?

How would it be possible to shutdown the DB in case this session process is
not responding to normal kill signals? Will it hinder the database from
shutting down normally? My previous experience with issuing immediate stops
or killing with -9 had been quite catastrophic and I could not start the DB
afterwards. What would you suggest in this case?

With best regards,

-- Valentine Gogichashvili

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Magnus Hagander 2010-05-19 15:17:19 Re: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
Previous Message Michael Enke 2010-05-19 08:53:12 BUG #5464: ecpg on 64bit system converts "long long" to "long"