Possible data corruption

From: Martijn Meijer <martijn(dot)meijer(at)fuga(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Possible data corruption
Date: 2015-08-28 11:14:55
Message-ID: CAOVf_d=hscUrGjB=yy-GvKgqOJqqcuwmrqg7QUzXx4W21puLeQ@mail.gmail.com
Lists: pgsql-bugs

Hi all,

I'm having weird issues with my Postgres installation, possibly data
corruption. The last backup I have is from a day and a half before, so
ideally I'd like to restore the data as-is.

I have made a full file system-level copy of the related data, as
instructed at https://wiki.postgresql.org/wiki/Corruption .

I was told I should provide the following:

A description of what you are trying to achieve and what results you
expect.:

Any query on some tables fails, including simple ones like: select count(*)
from contracts;

Gives:

ERROR: could not access status of transaction 552079857
DETAIL: Could not open file "pg_multixact/members/60D4": No such file or
directory.

PostgreSQL version number you are running:

9.3.4. This was the initial version installed on this machine, but the data
was previously on a different machine. It was imported from a normal export
(i.e., no -Fc or similar was passed to pg_dump).

How you installed PostgreSQL:

Added the postgres apt servers to the apt sources, then ran apt-get install

Changes made to the settings in the postgresql.conf file:

max_connections = 500
superuser_reserved_connections = 3
shared_buffers = 512MB
temp_buffers = 8MB
work_mem = 5MB
maintenance_work_mem = 16MB
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
seq_page_cost = 1.0
random_page_cost = 1.2
effective_cache_size = 1536MB
log_min_messages = info
log_min_duration_statement = 5000
log_checkpoints = on
autovacuum_naptime = 5min

Operating system and version:

Ubuntu Lucid (10.04) 64 bit

What program you're using to connect to PostgreSQL:

psql on the command line

Is there anything relevant or unusual in the PostgreSQL server logs?:

The messages about "Could not open file" started appearing last night at
19:00. I don't see any other relevant messages.

The EXACT TEXT of the error message you're getting, if there is one:

ERROR: could not access status of transaction 552079857
DETAIL: Could not open file "pg_multixact/members/60D4": No such file or
directory.
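
For anyone checking the same thing against a file system-level copy, here is a
minimal sketch of how the missing segment could be confirmed on disk. The
function name check_members_segment and the data directory argument are
illustrative assumptions, not part of PostgreSQL; only the
pg_multixact/members path and the 60D4 segment name come from the error above.

```shell
#!/bin/sh
# Report whether a pg_multixact/members segment file exists under a data
# directory. Both arguments are supplied by the caller (hypothetical helper).
check_members_segment() {
    datadir=$1   # data directory, or a file system-level copy of it
    segment=$2   # segment name as printed in the error, e.g. 60D4
    path="$datadir/pg_multixact/members/$segment"
    if [ -f "$path" ]; then
        echo "present: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

Called as check_members_segment /path/to/copy 60D4 (the path is a
placeholder), it prints one line and returns non-zero when the file is absent.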

Hardware details:

CPU: 2x Intel(R) Xeon(R) CPU E5-2603
RAM: 16 GB
Storage: 2x INTEL SSDSC2BW48 in mdraid 1 (2 other Intel SSD's present)

$ modinfo raid1
filename: /lib/modules/3.0.0-26-server/kernel/drivers/md/raid1.ko
alias: md-level-1
alias: md-raid1
alias: md-personality-3
description: RAID1 (mirroring) personality for MD
license: GPL
srcversion: 2AAEFFAAADEDE0EDEE8D523
depends:
vermagic: 3.0.0-26-server SMP mod_unload modversions

fsync=off was never used.
We did resize the partition containing the postgres files 2 weeks back
(following https://raid.wiki.kernel.org/index.php/Growing ).

What I already tried:

- Restarted PostgreSQL
- fsck.ext4 -fr (returned no results)
- vacuum analyze; (returns with the same error)
- From the 9.3.5 release notes:

postgres=# WITH list(file) AS (SELECT * FROM
pg_ls_dir('pg_multixact/offsets'))
postgres-# SELECT EXISTS (SELECT * FROM list WHERE file = '0000') AND
postgres-# NOT EXISTS (SELECT * FROM list WHERE file = '0001') AND
postgres-# NOT EXISTS (SELECT * FROM list WHERE file = 'FFFF') AND
postgres-# EXISTS (SELECT * FROM list WHERE file != '0000')
postgres-# AS file_0000_removal_required;
file_0000_removal_required
----------------------------
f
(1 row)
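
The same test can presumably be run without a server, directly against the
pg_multixact/offsets directory of the file system-level copy. The following is
only a sketch that mirrors the logic of the query above in shell; the function
name and the directory argument are assumptions, and the SQL from the release
notes remains the reference form.

```shell
#!/bin/sh
# Mirror of the release-notes query, using a directory listing instead of
# pg_ls_dir(). Prints "t" if removal of 0000 would be required, else "f".
# The argument is the path to a pg_multixact/offsets directory.
file_0000_removal_required() {
    dir=$1
    # 0000 must exist, and neither 0001 nor FFFF may exist.
    if [ ! -e "$dir/0000" ] || [ -e "$dir/0001" ] || [ -e "$dir/FFFF" ]; then
        echo f
        return 0
    fi
    # At least one entry other than 0000 must also exist.
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue
        if [ "$(basename "$entry")" != "0000" ]; then
            echo t
            return 0
        fi
    done
    echo f
}
```

For example, file_0000_removal_required /path/to/copy/pg_multixact/offsets,
where the path is a placeholder for wherever the copy lives.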

Thanks so much in advance for helping out!

Martijn Meijer
