Re: the un-vacuumable table

From: "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Subject: Re: the un-vacuumable table
Date: 2008-07-07 19:00:12
Message-ID: 5a0a9d6f0807071200h4895ecd5m6eee060ab4ea2953@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 3, 2008 at 10:57 PM, Andrew Hammond
<andrew(dot)george(dot)hammond(at)gmail(dot)com> wrote:
> On Thu, Jul 3, 2008 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Have you looked into the machine's kernel log to see if there is any
>> evidence of low-level distress (hardware or filesystem level)? I'm
>> wondering if ENOSPC is being reported because it is the closest
>> available errno code, but the real problem is something different than
>> the error message text suggests. Other than the errno the symptoms
>> all look quite a bit like a bad-sector problem ...

da1 is the storage device where the PGDATA lives.

Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929ba560:6810
timed out for ccb 0xffffff0000e20000 (req->ccb 0xffffff0000e20000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b90c0:6811
timed out for ccb 0xffffff0001081000 (req->ccb 0xffffff0001081000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b9f88:6812
timed out for ccb 0xffffff0000d93800 (req->ccb 0xffffff0000d93800)
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929ba560:6810 function 0
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929bcc90:6813
timed out for ccb 0xffffff03e132dc00 (req->ccb 0xffffff03e132dc00)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929ba560:6810
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929ba560:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b90c0:6811 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b90c0:6811
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b90c0:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b9f88:6812 function 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 6c 99 9 c0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b9f88:6812
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b9f88:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929bcc90:6813 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929bcc90:6813
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929bcc90:0 completed
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 65 1b 71 a0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:18:16 db1 kernel: mpt3: request 0xffffffff929d5900:56299
timed out for ccb 0xffffff03df7f5000 (req->ccb 0xffffff03df7f5000)

I think this is a smoking gun: mpt1 timed out requests, and da1 reported a
bus reset while retrying writes.

Andrew
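
For anyone who wants to see which blocks the two failed writes targeted, here
is a minimal sketch that decodes the WRITE(16) CDBs from the log above. It
assumes the standard SCSI WRITE(16) layout (opcode 0x8a, bytes 2-9 the logical
block address, bytes 10-13 the transfer length in blocks, both big-endian) and
that the CDB bytes wrapped across two log lines are rejoined into one string.
The function name decode_write16 is made up for illustration. The log does not
state the device's block size, so the sketch reports blocks rather than bytes.

```python
def decode_write16(cdb_text):
    """Decode a WRITE(16) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks the command asked to write.
    """
    cdb = bytes(int(tok, 16) for tok in cdb_text.split())
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: LBA
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


# The two CDBs from the da1 errors above, with the wrapped lines rejoined.
FAILED_WRITES = [
    "8a 0 0 0 0 1 6c 99 9 c0 0 0 0 20 0 0",
    "8a 0 0 0 0 1 65 1b 71 a0 0 0 0 20 0 0",
]

if __name__ == "__main__":
    for text in FAILED_WRITES:
        lba, blocks = decode_write16(text)
        print(f"LBA {lba:#x}, {blocks} blocks")
```

Both commands decode to 32-block writes at different addresses, which may help
when checking whether later errors keep hitting the same region of the device.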
