Re: could not read block 77 of relation 1663/16385/388818775

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Alexandra Nitzschke <an(at)clickware(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: could not read block 77 of relation 1663/16385/388818775
Date: 2008-11-20 18:36:33
Message-ID: 4925AE31.90504@postnewspapers.com.au
Lists: pgsql-bugs

Alexandra Nitzschke wrote:
> Hi,
>
> we have had similar postgres problems in the past.
> Please have a look at Bug 3484.
>
> We didn't resolve the problems mentioned in bug 3484. The other postgres
> developers also thought that these were hardware
> problems.
> So our customer bought a new server with a different hardware configuration
> ( ... and NEW hard drives ... ).
> The error occurred today on the new machine, which has been running under
> heavy load for only two days.

Yes, a hardware fault does seem somewhat unlikely, especially if in both
cases you've only seen issues with PostgreSQL. However, I'm a bit confused
by the fact that you're seeing apparent corruption all over the place:
your earlier report mentions damaged blocks across a number of
relations, and this one is a bad index. If this were a common PostgreSQL
bug, you'd expect this sort of thing to come up a lot on the list, and it
doesn't, so it must be assumed that there's something a bit unusual or
different about your configuration that's either triggering a
hard-to-hit bug in PostgreSQL, or that's damaging PostgreSQL's data
somehow.
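
For reference when comparing the damaged relations across reports, here is
a rough sketch (in Python) of how the numbers in the error message can be
read. It assumes the usual tablespace OID / database OID / relfilenode
layout, the default 8 kB block size and the default 1 GB segment files;
the function name and the constants are illustrative only and are not
taken from your setup.

```python
# Assumed defaults; a custom build may use different values.
BLOCK_SIZE = 8192          # bytes per block (default BLCKSZ)
SEGMENT_SIZE = 1 << 30     # bytes per on-disk segment file (default 1 GB)


def locate_block(relation, block):
    """Split 'tablespace/database/relfilenode' and locate a block on disk."""
    tablespace, database, relfilenode = (int(p) for p in relation.split("/"))
    blocks_per_segment = SEGMENT_SIZE // BLOCK_SIZE
    # Blocks past the first segment live in files named relfilenode.1, .2, ...
    segment, block_in_segment = divmod(block, blocks_per_segment)
    return {
        "tablespace_oid": tablespace,
        "database_oid": database,
        "relfilenode": relfilenode,
        "segment": segment,
        "byte_offset": block_in_segment * BLOCK_SIZE,
    }


info = locate_block("1663/16385/388818775", 77)
print(info)
```

Under those assumptions, block 77 sits in the first segment file of the
relation, so the damage (if any) would be near the start of that file.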

Is there any chance you have EVER hard-killed the postmaster manually
(e.g. with "kill -9" or "kill -KILL")? If you do that and don't also kill
the backends, it's my understanding that BAD things may happen,
especially if you then attempt to relaunch the postmaster.
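
As a minimal sketch of the kind of check I mean (Python, Unix only): the
first line of postmaster.pid in the data directory holds the postmaster's
PID, and sending signal 0 tests whether that process still exists without
disturbing it. The helper names are made up for illustration. Note that
this only tells you about the postmaster itself; orphaned backends would
still have to be looked for in the process list.

```python
import os


def read_postmaster_pid(pid_file_text):
    """Return the PID recorded on the first line of postmaster.pid contents."""
    first_line = pid_file_text.splitlines()[0].strip()
    return int(first_line)


def pid_is_alive(pid):
    """Probe a process with signal 0, which delivers nothing to it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # process exists but belongs to another user
    return True


# Example with made-up file contents; a real check would read
# <data directory>/postmaster.pid instead.
sample = "12345\n/var/lib/pgsql/data\n"
pid = read_postmaster_pid(sample)
print(pid, pid_is_alive(pid))
```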

Do you use _any_ 3rd-party C extensions? Contrib modules? They don't
have to be in the same database; one in another database on the same
machine could be bad too.

Do you have any unusual workload? What is your workload like?

What procedural languages, if any, do you use? PL/pgSQL? PL/Perl?
PL/Java? PL/Python? etc. Again, in any database, not just your problem
one. If you use any other than PL/pgSQL, please also note the version of
the language interpreter/tools and, in the case of Java, the JVM vendor
and install method.

Does your site possibly have dodgy power? Are the servers on a UPS?

Have the servers had any crashes, kernel panics, unexpected reboots, or
hard poweroffs?

(Not that it should matter, but): Have you hard-killed any backends
(kill -9 / SIGKILL)?

If you run a RAID verify using tw_cli or through the 3dm web interface,
does it report any block mismatches in the array?

--
Craig Ringer
