Re: the un-vacuumable table

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Subject: Re: the un-vacuumable table
Date: 2008-07-03 22:47:51
Message-ID: 10499.1215125271@sss.pgh.pa.us
Lists: pgsql-hackers

"Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com> writes:
> On Thu, Jul 3, 2008 at 2:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The whole thing is pretty mystifying, especially the ENOSPC write
>> failure on what seems like it couldn't have been a full disk.

> Yes, I've passed along the task of explaining why PG thought the disk
> was full to the sysadmin responsible for the box. I'll post the answer
> here, when and if we have one.

I just noticed something even more mystifying: you said that the ENOSPC
error occurred once a day during vacuuming. That doesn't make any
sense, because a write error would leave the shared buffer still marked
dirty, and so the next checkpoint would try to write it again. If
there's a persistent write error on a particular block, you should see
it being complained of at least once per checkpoint interval.
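
To make that reasoning concrete, here is a toy model in Python. It is only a sketch of the argument, not PostgreSQL's buffer manager: the names `Buffer`, `ToyBufferPool`, and `bad_blocks` are invented for illustration. A failed write leaves the buffer dirty, so each later checkpoint retries it and logs the failure again.

```python
import errno
from dataclasses import dataclass, field


@dataclass
class Buffer:
    block: int
    dirty: bool = True


@dataclass
class ToyBufferPool:
    # Hypothetical set of block numbers whose writes currently fail.
    bad_blocks: set
    buffers: list = field(default_factory=list)
    # Each log entry is (checkpoint number, block number, errno).
    log: list = field(default_factory=list)

    def write(self, buf):
        # Simulate the write failing with ENOSPC for a "bad" block.
        if buf.block in self.bad_blocks:
            raise OSError(errno.ENOSPC, "No space left on device")
        # Only a successful write clears the dirty flag.
        buf.dirty = False

    def checkpoint(self, number):
        # A checkpoint tries to write every buffer that is still dirty.
        for buf in self.buffers:
            if not buf.dirty:
                continue
            try:
                self.write(buf)
            except OSError as exc:
                # The buffer stays dirty, so the next checkpoint retries it.
                self.log.append((number, buf.block, exc.errno))


pool = ToyBufferPool(bad_blocks={2}, buffers=[Buffer(1), Buffer(2), Buffer(3)])
for n in range(1, 4):
    pool.checkpoint(n)

# Persistent failure: block 2 is reported at every checkpoint.
for entry in pool.log:
    print(entry)

# Transient failure: once the fault clears, the next checkpoint succeeds
# and nothing further is logged.
pool.bad_blocks.clear()
pool.checkpoint(4)
print(len(pool.log), [b.dirty for b in pool.buffers])
```

In this model a persistent fault produces one complaint per checkpoint, while a fault that clears before the next checkpoint produces none after the first, which is the distinction drawn above.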

If you didn't see that, it suggests that the ENOSPC was transient,
which isn't unreasonable --- but why would it recur for the exact
same block each night?

Have you looked into the machine's kernel log to see if there is any
evidence of low-level distress (hardware or filesystem level)? I'm
wondering if ENOSPC is being reported because it is the closest
available errno code, but the real problem is something different from
what the error message text suggests. Other than the errno, the symptoms
all look quite a bit like a bad-sector problem ...

regards, tom lane
