Re: Help - corruption issue?

From: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
To: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
Cc: PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Help - corruption issue?
Date: 2011-04-18 14:23:47
Message-ID: BANLkTikdkO5tpnOuvSSJyXF6A-AX5hAyHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks Filip.

I know which table it is. It's my largest table with over 125 million rows.

All the others are less than 100,000 rows. Most are in fact less than 25,000.

Now, which specific part of the table is corrupted -- if it is row
data, then can I dump specific parts of that table? How? Pg_dumpall
does not seem to have an option to have a "WHERE" clause?

If the lead index is corrupt, then issuing a reindex should work. So I
disconnected all other users. The DB was doing nothing. And then I
started a psql session and issued the command "reindex database MYDB".
After 3 hours, I see this error:

[QUOTE]
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: WARNING:
terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
Failed.
!>
[/UNQUOTE]

What am I to do now? Even reindex is not working. I can try to drop
indexes and create them again. Will that help?

2011/4/18 Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>:
> Phoenix,
>
> how large (in total) is this database)?
>
> can you copy (cp -a) the data directory somewhere? I would do this
> just in case :-)
>
>
> regarding the manual recovery process:
>
> 1. you'll have to isolate corrupted table.
> you can do this by dumping all tables one-by-one (pg_dump -t TABLE)
> until you get the error.
>
> 2. find the record which is corupted... approach like this might work:
> select count(*) from the_corrupted_table where PK_column <= some_value.
>
> 3 .you should try to dump the table by chunks - skipping the corrupted
> row(s) if possible
>
> 4. if above method does not work, you can try manually hex-editing
> (zeroing) some bytes (with postgres shut down) to make dump work
> again.
>
>
> PS. obligatory note:
>
> 8.2.9 Release Date: 2008-06-12; 8.2.21 Release Date: 2011-04-18
> seems like you were running almost three years without bugfixes.
> aside from fixing your current problem, I would first do the upgrade
> to avoid more corruption.
>
>
>
>
>
>
> 2011/4/18 Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
>>
>> While doing a PG dump, I seem to have a problem:
>>
>> ERROR: invalid memory alloc request size 4294967293
>>
>> Upon googling, this seems to be a data corruption issue!
>>
>> ( Came about while doing performance tuning as being discussed on the
>> PG-PERFORMANCE list:
>> http://postgresql.1045698.n5.nabble.com/REINDEX-takes-half-a-day-and-still-not-complete-td4005943.html
>> )
>>
>> One of the older messages suggests that I do "file level backup and
>> restore the data".
>> http://archives.postgresql.org/pgsql-admin/2008-05/msg00191.php
>>
>> How does one do this -- should I copy the data folder? What are the
>> specific steps?
>>
>> I'm on PG 8.2.9, CentOS 5, with 8GB of RAM. The disks are four SATAII
>> disks on RAID 1.
>>
>> Thanks!
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Frank van Vugt 2011-04-18 14:37:16 REASSIGN OWNED doesn't for all objects, ALTER FUNCTION seems to fix it
Previous Message Karsten Hilbert 2011-04-18 13:53:16 problem with parent/child table and FKs