Segmentation fault ( after Shared Lock acquired, same table 4 times )

From: Jan Mussler <janm81(at)googlemail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Segmentation fault ( after Shared Lock acquired, same table 4 times )
Date: 2012-04-02 10:24:30
Message-ID: CAKZTnzWTmOcCxy=Gmwf3cC_4UennuAYZQ-E61VHOk7CYkSCX8Q@mail.gmail.com
Lists: pgsql-admin

Dear all,

Yesterday evening we ran into some trouble on one of our databases running
PostgreSQL 9.0.4 on an AMD Opteron setup with ECC memory and SAN storage.

This database has performed very well for months without any trouble, with
fsync on, full page writes on, and streaming replication to our replica slaves.

Then yesterday PostgreSQL reported a segmentation fault, "server process
(PID 19531) was terminated by signal 11: Segmentation fault", shut down
all processes, and reinitialized successfully.

This happened immediately after: process 19531 acquired ShareLock on
transaction 3193803679 after 2277.230 ms",,,,,"SQL statement ""UPDATE ONLY
xxxx.parent_table

The same error occurred 1 minute later, on the same table, with the same
"update only" statement.

Again, all processes were shut down and reinitialized successfully.

At this point we decided not to switch, because we considered PostgreSQL's
decision to restart itself safe (trusting in conservative and safe
decisions by the developers and a good implementation of shutdown and
recovery), and because we had already been running again for minutes while
discussing our options.

Roughly 3 hours later: same error, same table, again a segmentation fault,
and again twice within minutes.

This time we stopped manually, upgraded to 9.0.7, and restarted. We still
believed that our setup, together with the shutdown and recovery, would
protect us from data corruption.

Vacuum later reported the following: 2012-03-31 00:21:25
CEST,124/3728,0,WARNING,01000,"relation ""xxxx.child_table"" page 912002 is
uninitialized --- fixing",,,,,,,,,""

We renamed the parent table and its child, created identical new ones and
moved the relevant data into the new tables.

We have now been running for 8 hours without any problems.

I am sharing this because I am looking for similar experiences, and for good
or bad outcomes of similar errors. Especially since we ran into the same
error on the same table 4 times, which is at least 3 times too many, if not
4.

Thanks for your input!

Best Regards,
Jan