could not access status of transaction 825832753

From: Stephen Tyler <stephen(at)stephen-tyler(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: could not access status of transaction 825832753
Date: 2009-12-07 09:53:18
Message-ID: 51549ea20912070153x47891932j19304201e4120bc7@mail.gmail.com
Lists: pgsql-general

I just got this error, and I don't know why I got it:

7/12/09 2:57:24 PM org.postgresql.postgres[89] ERROR: could not access status of transaction 825832753
7/12/09 2:57:24 PM org.postgresql.postgres[89] DETAIL: Could not open file "pg_clog/0313": No such file or directory.
7/12/09 2:57:24 PM org.postgresql.postgres[89] STATEMENT: select u.link, u.url from link_relurl as u left join link_meta as m on (m.link = u.link) where u.url like 'http://www.somedomain.com/%' and released not in (2,4) and url not like 'http://www.somedomain.com/blah%' order by length(url) limit 200;

Retrying the SQL resulted in the same error.

I immediately ran pg_dump on the entire database. No errors reported.

I checked the disk volume. No problems found. No console message about
disk errors. SMART status is OK.

I quit psql, and then restarted psql and re-entered the SQL. The statement
succeeded.

I ran "vacuum analyze <tablename>" on both of the tables in the SQL statement.
No errors. Both tables are quite large (around 20 GB).

I then did "select count(*) from link_relurl" and my Mac crashed hard
(multilingual grey-screen asking me to hold the power button down).

After reboot, "select count(*) from link_relurl" (and the other table)
succeeded.

pg_clog/ contains 146 files from 03F1 to 0482, so pg_clog/0313 is long gone.
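
As a sanity check on those file names, here is a minimal sketch of how a
transaction ID maps to a pg_clog segment file. It assumes the default build
constants (8 kB blocks, 2 status bits per transaction, 32 pages per segment);
a server built with a different block size would give different numbers. The
function names are hypothetical, chosen only for illustration.

```python
# Assumed defaults for a stock PostgreSQL build; adjust if BLCKSZ differs.
BLCKSZ = 8192                 # bytes per page
CLOG_XACTS_PER_BYTE = 4       # 2 status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32   # pages per segment file

# Number of transaction IDs covered by one pg_clog segment file.
XACTS_PER_SEGMENT = BLCKSZ * CLOG_XACTS_PER_BYTE * SLRU_PAGES_PER_SEGMENT


def clog_segment_name(xid: int) -> str:
    """Return the pg_clog file name (4 hex digits) that holds this xid."""
    return f"{xid // XACTS_PER_SEGMENT:04X}"


def first_xid_in_segment(name: str) -> int:
    """Return the lowest transaction ID stored in the named segment."""
    return int(name, 16) * XACTS_PER_SEGMENT


def segments_in_range(first: str, last: str) -> int:
    """Count segment files from first to last, inclusive."""
    return int(last, 16) - int(first, 16) + 1


if __name__ == "__main__":
    print(clog_segment_name(825832753))        # 0313
    print(segments_in_range("03F1", "0482"))   # 146
    print(first_xid_in_segment("03F1"))        # 1058013184
```

Under those assumptions, transaction 825832753 does fall in segment 0313, and
the oldest segment still present (03F1) starts well above that transaction ID,
which is consistent with 0313 having been removed some time ago.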

I've searched past messages and found references to disk corruption and
advice to rebuild the entire database. Is that still the advice? Is there
any way to check that the database is not corrupted? Is running "vacuum
analyze" on a table enough to prove it is not corrupted?

My details:

Mac Pro 2009 Quad 2.93 with 16 GB of ECC RAM
Snow Leopard 10.6.2 in 64-bit mode, fully patched
Database on RAID 0 array of SSDs
Postgres 8.4.1, 64-bit, compiled from source

I just installed Windows 7 in Boot Camp (on a different disk) and
rearranged the SATA cabling. But since the problem "disappeared" on reboot,
I'm thinking the corruption, if any, was in RAM, not on disk.

Stephen
