|From:||Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>|
|Views:||Raw Message | Whole Thread | Download mbox|
I sent already an email about this topic to pgsql-admins but I think that
it might be more relevant to this mailing list. I'm trying to investigate a
corruption that happened on a machine of one of our clients.
A little background :
-os Centos 6.6
-I was asked to understand why the corruption happened and how can we fix
it. I copied the data dir to my machine on my local pc and started
investigate it. They realized that there is a problem when they tried to
backup the database via pg_dump. The error they got :
pg_dump: query returned 2 rows instead of one: SELECT tableoid, oid,
(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = datdba) AS dba,
pg_encoding_to_char(encoding) AS encoding, datcollate, datctype,
datfrozenxid, (SELECT spcname FROM pg_tablespace t WHERE t.oid =
dattablespace) AS tablespace, shobj_description(oid, 'pg_database') AS
description FROM pg_database WHERE datname = 'db1'
So I checked and indeed there were duplicated rows in the pg_roles /
pg_database tables. Moreover, there were multiple values in all the system
The first think I did as Tom Lane suggested is to upgrade to v9.2.24.
Afterwards I vacuumed all the databases but nothing helped. I tried to
reindex the databases but I got the next error :
2018-08-28 21:50:03 +08 db2 24360 ERROR: could not access status of
2018-08-28 21:50:03 +08 db2 24360 DETAIL: Could not open file
"pg_subtrans/01EB": No such file or directory.
So I tried to analyze all the system catalogs manualy one by one :
select 'VACUUM ANALYZE pg_catalog.'||table_name from
information_schema.tables where table_schema = 'pg_catalog' and table_type
VACUUM ANALYZE pg_catalog.pg_statistic
VACUUM ANALYZE pg_catalog.pg_type
VACUUM ANALYZE pg_catalog.pg_authid
VACUUM ANALYZE pg_catalog.pg_proc
VACUUM ANALYZE pg_catalog.pg_class
VACUUM ANALYZE pg_catalog.pg_user_mapping
and I got error only on the next system table :
afa=# ANALYZE pg_catalog.pg_shdepend;
ERROR: could not access status of transaction 32212695
DETAIL: Could not open file "pg_subtrans/01EB": No such file or directory.
When I try to reindex the table I'm getting the same error.
Solutions I tried :
-vacuum all the databases
-reindex system db_name - > raise the same error (could not access
-upgrade to v9.2.24 (still got corruption).
-generate an empty subtransaction file with dd
-Trying to reindex system indexes in single mode -> raise same error.
*Tried to delete the duplicated values in some of the system tables by the
ctid value, but nothing is deleted.
Any idea how can I continue ?
|Next Message||Jonathan S. Katz||2018-08-28 14:38:06||Re: Something's busted in plpgsql composite-variable handling|
|Previous Message||David Steele||2018-08-28 14:27:09||Re: Would it be possible to have parallel archiving?|