Re: Notice and share memory corruption

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Notice and share memory corruption
Date: 2000-09-18 08:17:57
Message-ID: 39C5CFB5.B27029CA@tm.ee
Lists: pgsql-hackers

Tom Lane wrote:
>
> Hannu Krosing <hannu(at)tm(dot)ee> writes:
> > I get the following on untuned Linux (Redhat 6.2) using stock 7.0.2
> > rpm-s
>
> > NOTICE: RegisterSharedInvalid: SI buffer overflow
> > NOTICE: InvalidateSharedInvalid: cache state reset
>
> > Actually I get many of them ;(
>
> AFAIK, these are just noise in 7.0. The only reason you see them is
> we haven't got round to removing the messages or downgrading them to
> elog(DEBUG).
>
> > I'm running a script that does a bunch of mixed INSERTS, UPDATES,
> > DELETES and SELECTS.
>
> I'll bet you also have some backends sitting idle with open
> transactions? The combination of idle and active backends is what
> usually provokes SI overruns.
>
> > after getting that I'm unable to vacuum database until I reset the OS
>
> Define your terms more carefully, please. What do you mean by
> "unable to vacuum" --- what happens *exactly*?

NOTICE: FlushRelationBuffers(access_right, 2009): block 1944 is referenced (private 0, global 2)
FATAL 1: VACUUM (vc_repair_frag): FlushRelationBuffers returned -2
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

> In any case,
> surely it doesn't take an OS reboot to recover. I might believe
> you need to restart the postmaster...

On one machine a simple restart worked.

Maybe I have to really restart it by running
killall -9 /usr/bin/postgres
(instead of doing /etc/rc.d/init.d/postgresql restart).

I was quite sure that just restarting it did not help, but maybe
it did not really restart and only claimed to.
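
One way to rule out the "only claimed to restart" case is to look for
surviving server processes after the stop. The following is a minimal
sketch, not part of the original report. It assumes a Linux `ps` that
accepts `-eo pid,comm`, and the function names and the process names it
looks for ("postmaster", "postgres") are assumptions for illustration.

```python
import subprocess


def surviving_backends(ps_output, names=("postmaster", "postgres")):
    """Return (pid, command) pairs whose command name is in `names`.

    `ps_output` is expected to be the text printed by `ps -eo pid,comm`,
    including its header line.
    """
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the "PID COMMAND" header
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        pid, comm = parts[0], parts[1].strip()
        if pid.isdigit() and comm in names:
            found.append((int(pid), comm))
    return found


def list_processes():
    """Capture the current process list (assumes procps-style `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `surviving_backends(list_processes())` right after the init
script reports a stop should give an empty list if the server really went
away; anything left in the list is a process that outlived the "restart".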

On the other machine I still get

amphora2=# vacuum;
NOTICE: FlushRelationBuffers(item, 30): block 2 is referenced (private 0, global 1)
FATAL 1: VACUUM (vc_repair_frag): FlushRelationBuffers returned -2
pqReadData() -- backend closed the channel unexpectedly.
This probably means the backend terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

This is after stopping the postmaster (and checking that it is stopped).

I could do a vacuum after restarting the whole machine...

OTOH it _may_ be that someone started another backend right after the
restart and did something,
but must this be a FATAL error?
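
Tom's remark quoted above, that a mix of idle and active backends is what
usually provokes SI overruns, can be illustrated with a toy model. This is
only a sketch under stated assumptions: a fixed-size message queue with one
read position per backend, where an overflow forces every backend to reset.
The class, its fields and the helper function are hypothetical and are not
PostgreSQL's actual data structures or code.

```python
class ToySIQueue:
    """Toy fixed-size invalidation queue (illustration only)."""

    def __init__(self, capacity, backends):
        self.capacity = capacity
        self.next_msg = 0                          # messages written so far
        self.read_pos = {b: 0 for b in backends}   # next message per backend
        self.resets = {b: 0 for b in backends}     # cache resets per backend

    def write(self):
        """Post one message; return False if the queue overflowed."""
        slowest = min(self.read_pos.values())
        if self.next_msg - slowest >= self.capacity:
            # The slowest reader has left no free slot: discard the backlog
            # and make every backend reset its cache state instead.
            for b in self.read_pos:
                self.read_pos[b] = self.next_msg
                self.resets[b] += 1
            return False
        self.next_msg += 1
        return True

    def read_all(self, backend):
        """Let one backend catch up on everything pending."""
        self.read_pos[backend] = self.next_msg


def count_overflows(writes, idle_reads, capacity=4):
    """Run `writes` posts from an active backend next to a second backend."""
    queue = ToySIQueue(capacity, ["active", "idle"])
    overflows = 0
    for _ in range(writes):
        if not queue.write():
            overflows += 1
        queue.read_all("active")
        if idle_reads:
            queue.read_all("idle")
    return overflows
```

In this model, `count_overflows(10, idle_reads=False)` overflows repeatedly
because the idle backend never advances its read position, while
`count_overflows(10, idle_reads=True)` never overflows.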

-----------
Hannu
