Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeremy Finzel <finzelj(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Date: 2018-03-19 18:17:23
Message-ID: 20180319181723.ugaf7hfkluqyos5d@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi Jeremy, Alvaro,

On 2018-03-19 13:00:13 -0500, Jeremy Finzel wrote:
> On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> wrote:
>
> > Jeremy Finzel wrote:
> > > Getting some concerning errors in one of our databases that is on 9.5.11,
> > > on autovacuum from template0 database pg_authid and pg_auth_members. I
> > > only saw some notes on the list about this error related to materialized
> > > views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> > > 9.5. Here is an example:
> > >
> > > 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
> > 12:08:33
> > > CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
> > relfrozenxid
> > > 740087784",,,,,"automatic vacuum of table
> > > ""template0.pg_catalog.pg_authid""",,,,""
> >
> > Can you please supply output of pg_controldata?

> Latest checkpoint's NextXID: 16/3132524419
> Latest checkpoint's NextMultiXactId: 2142
> Latest checkpoint's NextMultiOffset: 5235
> Latest checkpoint's oldestXID: 1829964553
> Latest checkpoint's oldestXID's DB: 12376
> Latest checkpoint's oldestActiveXID: 3131774441
> Latest checkpoint's oldestMultiXid: 1
> Latest checkpoint's oldestMulti's DB: 16400

Hm, based on these it doesn't look like multixacts were involved (based
on oldestMultiXid it's highly unlikley there've multi wraparound, and
there's not much multixact usage on system tables anyway). Which
suggests that there might have been actual corrpution here.

Jeremy:
- which version of 9.4 and 9.5 ran on this? Do you know?
- Can you install the pageinspect extension? If so, it might be a
CREATE EXTENSION pageinspect;
CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
SELECT blockno, lp, t_xmin
FROM
generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno, -- every block in the relation
heap_page_items(get_raw_page($1::text, blockno::int4)) -- every item on the page
WHERE
t_xmin IS NOT NULL -- filter out empty items
AND t_xmin != 1 -- filter out bootstrap
AND t_xmin != 2 -- filter out frozen transaction id
AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' | x'0200')::int) -- filter out frozen rows with xid present
AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid = $1)) -- xid cutoff filter
$$;
SELECT * FROM check_rel('pg_authid') LIMIT 100;

and then display all items for one of the affected pages like
SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));

Alvaro:
- Hm, we talked about code adding context for these kind of errors,
right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2018-03-19 18:32:15 Re: PostgreSQL 9.6 Temporary files
Previous Message Jeremy Finzel 2018-03-19 18:00:13 Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-03-19 18:40:54 Re: [PROPOSAL] Shared Ispell dictionaries
Previous Message Andres Freund 2018-03-19 18:07:17 Re: [PROPOSAL] Shared Ispell dictionaries