Re: [PG 8.1.0 / AIX 5.3] Vacuum processes freezing

From: RESTOUX, Loïc <loic(dot)restoux(at)capgemini(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PG 8.1.0 / AIX 5.3] Vacuum processes freezing
Date: 2007-06-20 15:27:52
Message-ID: F10D59BC922A9D47B7FE1B0A31AC7B53476917@CORPMAIL31.corp.capgemini.com
Lists: pgsql-performance

Hi Tom, thanks for your reply,

> Have you looked into pg_locks to see if it's blocked on
> someone else's lock?

Yes, we looked into pg_locks and the vacuumdb process wasn't blocked. The view showed
four locks held by the vacuum, all with granted = true.

In fact, we found that a similar bug was fixed in 8.1.1:
> # Fix bgwriter problems after recovering from errors (Tom)
> The background writer was found to leak buffer pins after write errors.
> While not fatal in itself, this might lead to mysterious blockages of later VACUUM commands.
( http://www.postgresql.org/docs/8.1/static/release-8-1-1.html )

Can anyone confirm that the symptoms of this bug match our problem?
We saw log entries like:
<2007-06-11 12:44:04 DFT%>LOG: could not fsync segment 0 of relation 16391/16394/107912: A system call received an interrupt.
<2007-06-11 12:44:04 DFT%>ERROR: storage sync failed on magnetic disk: A system call received an interrupt.

Or:
<2007-06-16 12:25:45 DFT%>ERROR: could not open relation 16393/16394/107926: A system call received an interrupt.
<2007-06-16 12:25:45 DFT%>CONTEXT: writing block 3 of relation 16393/16394/107926

But we can't see a correlation between the fsync errors and the vacuum blockages: after an
fsync error, sometimes the vacuum works fine and sometimes it hangs. Is there any way to
reproduce this bug manually, so that we can confirm it is the cause of our problem and that
it is definitely fixed in 8.1.9? How can I find the patch for this bug in the source code?
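
One way to check for such a correlation is to pull the interrupted-system-call entries out
of the server log and compare their timestamps with the times at which a vacuum hung. The
following is a minimal Python sketch, assuming the log prefix format shown in the samples
above. The function names interrupted_errors and errors_before, and the one-hour window,
are hypothetical and chosen only for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumption: every new log entry starts with "<YYYY-MM-DD HH:MM:SS TZ%>LEVEL:",
# as in the samples quoted in this message. Other log_line_prefix settings
# would need a different pattern.
PREFIX_RE = re.compile(
    r"^<(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) [^%>]+%>(LOG|ERROR|CONTEXT):\s*(.*)$"
)

# Error text reported on this system for an interrupted system call.
EINTR_TEXT = "A system call received an interrupt"


def interrupted_errors(lines):
    """Return (timestamp, level, message) tuples for log entries that
    mention an interrupted system call.

    Lines without the prefix are treated as continuations of the
    previous entry and joined to it with a single space.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = PREFIX_RE.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append([ts, match.group(2), match.group(3)])
        elif entries and line.strip():
            entries[-1][2] += " " + line.strip()
    return [(ts, lvl, msg) for ts, lvl, msg in entries if EINTR_TEXT in msg]


def errors_before(hang_time, errors, window=timedelta(hours=1)):
    """Return the errors logged at most `window` before `hang_time`.

    The one-hour default is an arbitrary illustrative value.
    """
    return [e for e in errors if timedelta(0) <= hang_time - e[0] <= window]
```

Feeding the server log to interrupted_errors and then calling errors_before once per
observed hang gives, for each hang, the list of interrupted writes that preceded it. An
empty list for a hang would suggest that the hang is not tied to a recent write error.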

Regards,

--
Loic Restoux
Capgemini Telecom & Media / ITDR
tel : 02 99 27 82 30
e-mail : loic(dot)restoux(at)capgemini(dot)fr

