Re: FSM corruption leading to errors

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FSM corruption leading to errors
Date: 2016-10-20 05:11:06
Message-ID: CABOikdNKB706ASToj5jowqzQaVK6OvUsqPfw7i6VpKjGuRBYmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 19, 2016 at 6:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
>
> Can we document an existing procedure for repairing FSM corruption?
> (VACUUM, maybe?)

I'm afraid it may not be easy to repair the corruption with existing
facilities. Most often the corruption will be on the standby and a VACUUM
may not actually touch affected pages on the master (because they may not
even exists on the master or skipped because of visibility maps). It may
not even trigger relation truncation. So AFAICS it may not generate any WAL
activity that can fix the corruption on the standby.

One possible way would be to delete the FSM (and VM) information on the
master and standby and then run VACUUM so these maps are rebuilt. We
obviously don't need to do this for all tables, but we need a way to find
the tables with corrupt FSM [1].

Suggested procedure could be:

1. Upgrade master and standby to the latest minor release (which involves
restart)
2. Install pg_freespace extension and run the query [1] on master to find
possible corruption cases. The query checks if FSM reports free space in a
block outside the size of the relation. Unfortunately, we might have false
positives if the relation is extended while the query is running.
3. Repeat the same query on standby (if it's running in Hot standby mode,
otherwise the corruption can only be detected once it's promoted to be a
master)
4. Remove FSM and VM files for the affected tables (I don't think if it's
safe to do this on a running server)
5. VACUUM affected tables so that FSM and VM is rebuilt.

Another idea is to implement a pg_freespace_repair() function in
pg_freespace which takes an AccessExclusiveLock on the table and truncates
it to it's current size, thus generating a WAL record that the standby will
replay to fix the corruption. This probably looks more promising, easy to
explain and less error prone.

[1] SELECT *
FROM (
SELECT oid::regclass as relname, EXISTS (
SELECT *
FROM (
SELECT blkno, pg_freespace(oid::regclass, blkno)
FROM
generate_series(pg_relation_size(oid::regclass)/
current_setting('block_size')::bigint, pg_relation_size(oid::regclass,
'fsm') / 2) as blkno
) as avail
WHERE pg_freespace > 0
) as corrupt_fsm
FROM pg_class
WHERE relkind = 'r'
) b
WHERE b.corrupt_fsm = true;

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-10-20 05:14:13 Re: Password identifiers, protocol aging and SCRAM protocol
Previous Message Amit Kapila 2016-10-20 04:37:22 Re: Parallel bitmap heap scan