Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeremy Finzel <finzelj(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Date: 2018-05-22 19:42:08
Message-ID: CAK-MWwThWtcQYHCs7wwSnqiToHydaVnwkpHS537EAGojZ263Fg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Tue, May 22, 2018 at 10:30 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2018-05-22 22:18:15 +0300, Maxim Boguk wrote:
> > On Tue, May 22, 2018 at 9:47 PM, Andres Freund <andres(at)anarazel(dot)de>
> wrote:
> > > > select relfrozenxid from pg_class where relname='pg_authid';
> > > > relfrozenxid
> > > > --------------
> > > > 2863429136
>
> > ​select txid_current();
> > txid_current
> > --------------
> > 41995913769
>
> So that's an xid of 3341208114, if you leave the epoch out. What's
> ​​
> autovacuum_freeze_max_age set to in that cluster?
>

​postgres=# show autovacuum_freeze_max_age;
autovacuum_freeze_max_age
---------------------------
200000000
(default value I think)​

> Can you show pg_controldata output, and
> ​​
> relminmxid from that cluster?
>

​postgres(at)db:~$ /usr/lib/postgresql/9.6/bin/pg_controldata -D
/var/lib/postgresql/9.6/main
pg_control version number: 960
Catalog version number: 201608131
Database system identifier: 6469368654711450114
Database cluster state: in production
pg_control last modified: Tue 22 May 2018 10:20:14 PM MSK
Latest checkpoint location: CCB5/F9C37950
Prior checkpoint location: CCB0/43F316B0
Latest checkpoint's REDO location: CCB1/6706BD88
Latest checkpoint's REDO WAL file: 000000010000CCB100000067
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 9:3341161759
Latest checkpoint's NextOID: 190071899
Latest checkpoint's NextMultiXactId: 59416233
Latest checkpoint's NextMultiOffset: 215588532
Latest checkpoint's oldestXID: 2814548646
Latest checkpoint's oldestXID's DB: 16400
Latest checkpoint's oldestActiveXID: 3341161759
Latest checkpoint's oldestMultiXid: 54264778
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:2814548646
Latest checkpoint's newestCommitTsXid:3341161758
Time of latest checkpoint: Tue 22 May 2018 10:05:16 PM MSK
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: on
max_connections setting: 2000
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: on
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0

postgres=# select datname,datfrozenxid,datminmxid from pg_database order by
datname;
datname | datfrozenxid | datminmxid
-----------+--------------+------------
** | 2815939794 | 54265194
postgres | 2863429136 | 54280819
template0 | 3148297669 | 59342106
template1 | 2816765546 | 59261794

>
> I might be daft here, but it's surely curious that the relfrozenxid from
> the error and pg_catalog are really different (number of digits):
> catalog: 2863429136
> error: 248712603
>
>
> > ​About gdb bt - it's tricky because it is mission critical master db of
> > huge project.
> > I'll will try promote backup replica and check is issue persist there and
> > if yes - we will have our playground for a while, but it will require
> > sometime to arrange.​
>
> You should be ok to just bt that in the running cluster, but I
> definitely understand if you don't want to do that... I'd appreciate if
> you set up the a playground, because this seems like something that'll
> reappear.
>

​My gdb kunf-fu exceptionally rusty now, so I better play in sandbox.​
​And if error did't exist on fresh promoted replica it will give us useful
info as well.​

--
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/ <http://www.postgresql-consulting.com/>

Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно
когда я так делаю ещё раз?"

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Paolo Crosato 2018-05-22 19:43:01 Re: Error on vacuum: xmin before relfrozenxid
Previous Message Andres Freund 2018-05-22 19:30:07 Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Browse pgsql-hackers by date

  From Date Subject
Next Message Paolo Crosato 2018-05-22 19:43:01 Re: Error on vacuum: xmin before relfrozenxid
Previous Message Andres Freund 2018-05-22 19:30:07 Re: found xmin from before relfrozenxid on pg_catalog.pg_authid