Re: Weird disk/table space consumption problem

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Dirk Riehle <dirk(at)riehle(dot)org>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Weird disk/table space consumption problem
Date: 2009-07-12 08:39:12
Message-ID: 1247387952.18105.19.camel@ayaki
Lists: pgsql-general

On Sat, 2009-07-11 at 18:19 -0700, Dirk Riehle wrote:

> I do have some weird every few days error where the soft raid blocks for
> a couple of seconds and I get this kernel log output:
>
> Jul 7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY }
> Jul 7 19:58:55 server kernel: [40336.000244] ata1.00: cmd
> 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out
> Jul 7 19:58:55 server kernel: [40336.000245] res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Have you used smartctl (from the smartmontools package - on
Debian/Ubuntu at least) to examine the drive?

In particular, you should ask the drive to do a self-test and media
scan. This will not take it out of the RAID or prevent it from
servicing normal operations, though it may slow it down a bit. Run:

smartctl -d ata -t long /dev/sda

then "sleep" for however long it says the test will take, e.g. "sleep 2h".

When the sleep command exits, run:

smartctl -d ata -a /dev/sda

to see general info on the drive, its error logs, and its test logs. If
you see errors logged on the drive, if the test shows as failed, if you
see a non-zero "reallocated sector" count, or if "pending sector" is
non-zero, then it's time to replace the drive.
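
As a rough sketch of the check described above, the two sector counts can be
pulled out of the attribute table automatically. This assumes the usual
"smartctl -A" table layout, where the attribute name is the second column and
the raw value is the tenth, and it assumes the drive reports the attributes
under the names Reallocated_Sector_Ct and Current_Pending_Sector (some vendors
use other names). The function name check_smart_attrs is made up for this
example. It does not look at the error log or the self-test log, so those still
need to be read by hand.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin, prints every
# watched attribute whose raw value is non-zero, and returns 1 if any was found.
check_smart_attrs() {
    awk '
        BEGIN { bad = 0 }
        # Column 2 is the attribute name, column 10 the raw value
        # (assumed table layout; verify against your own smartctl output).
        $2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" {
            if ($10 + 0 != 0) { print $2 " = " $10; bad = 1 }
        }
        END { exit bad }
    '
}

# Usage (needs root and a real drive, so it is left commented out here):
#   smartctl -d ata -A /dev/sda | check_smart_attrs || echo "look at replacing this drive"
```

A non-zero exit status only means one of the two counters is non-zero. The
full "smartctl -d ata -a" output is still the place to look for logged errors
and failed tests.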

--
Craig Ringer
