Re: the un-vacuumable table

From:	"Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Hackers <pgsql-hackers(at)postgresql(dot)org>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Subject:	Re: the un-vacuumable table
Date:	2008-07-04 05:57:36
Message-ID:	5a0a9d6f0807032257l7217d1efx79453e06407774f3@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Jul 3, 2008 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com> writes:
>> On Thu, Jul 3, 2008 at 2:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> The whole thing is pretty mystifying, especially the ENOSPC write
>>> failure on what seems like it couldn't have been a full disk.
>
>> Yes, I've passed along the task of explaining why PG thought the disk
>> was full to the sysadmin responsible for the box. I'll post the answer
>> here, when and if we have one.
>
> I just noticed something even more mystifying: you said that the ENOSPC
> error occurred once a day during vacuuming.

Actually, the ENOSPC happened only once. After that first error, we got

vacuumdb: vacuuming of database "adecndb" failed: ERROR: failed to
re-find parent key in "ledgerdetail_2008_03_idx2" for deletion target
page 64767

repeatedly.

> That doesn't make any
> sense, because a write error would leave the shared buffer still marked
> dirty, and so the next checkpoint would try to write it again. If
> there's a persistent write error on a particular block, you should see
> it being complained of at least once per checkpoint interval.
>
> If you didn't see that, it suggests that the ENOSPC was transient,
> which isn't unreasonable --- but why would it recur for the exact
> same block each night?
>
> Have you looked into the machine's kernel log to see if there is any
> evidence of low-level distress (hardware or filesystem level)? I'm
> wondering if ENOSPC is being reported because it is the closest
> available errno code, but the real problem is something different than
> the error message text suggests. Other than the errno the symptoms
> all look quite a bit like a bad-sector problem ...

I will pass this along to the sysadmin in charge of this box.
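
Tom's point about dirty buffers can be illustrated with a toy model. This is only a sketch, not PostgreSQL's buffer manager: the class and function names and the block number 42 are invented for illustration, and a real checkpoint that hits a write error does not simply carry on to the next buffer. The model keeps just the one property he describes: a block whose write fails stays dirty, so every later checkpoint retries it and complains again until the write succeeds.

```python
import errno


class ToyBufferPool:
    """Toy model of dirty-buffer tracking (hypothetical, not PostgreSQL code)."""

    def __init__(self):
        self.dirty = set()   # block numbers waiting to be written
        self.log = []        # (block, errno) for every failed write

    def mark_dirty(self, block):
        self.dirty.add(block)

    def checkpoint(self, write_block):
        # Try to flush every dirty block. A block is marked clean only
        # if its write succeeds; otherwise it stays dirty for next time.
        for block in sorted(self.dirty):
            try:
                write_block(block)
            except OSError as exc:
                self.log.append((block, exc.errno))
            else:
                self.dirty.discard(block)


def make_writer(bad_block, failures):
    """Return a writer that fails with ENOSPC `failures` times for one block."""
    remaining = [failures]

    def write_block(block):
        if block == bad_block and remaining[0] > 0:
            remaining[0] -= 1
            raise OSError(errno.ENOSPC, "No space left on device")

    return write_block


def run(failures, checkpoints=3):
    """Dirty one block, then run several checkpoints against a faulty writer."""
    pool = ToyBufferPool()
    pool.mark_dirty(42)  # arbitrary block number, for illustration only
    writer = make_writer(42, failures)
    for _ in range(checkpoints):
        pool.checkpoint(writer)
    return pool


transient = run(failures=1)     # fails once, then the retry succeeds
persistent = run(failures=10)   # fails at every checkpoint

print(len(transient.log), transient.dirty)
print(len(persistent.log), persistent.dirty)
```

In this model a transient fault produces a single logged error and the block is clean after the next checkpoint, whereas a persistent fault produces one complaint per checkpoint. A single ENOSPC, as we saw, matches the transient case.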

In response to

Re: the un-vacuumable table at 2008-07-03 22:47:51 from Tom Lane

Responses

Re: the un-vacuumable table at 2008-07-07 19:00:12 from Andrew Hammond
