Error on vacuum: xmin before relfrozenxid

From: Paolo Crosato <paolo(dot)crosato(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Error on vacuum: xmin before relfrozenxid
Date: 2018-05-22 14:18:20
Message-ID: CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-generalpgsql-hackers

Hi,

we have an error happening on a catalog table on one of the dbs in the
instance.
This is the error report form:

A description of what you are trying to achieve and what results you
expect.:

Any kind of vacuum fails on pg_authid table, I would expect it to succeed.
This is occasionaly blocking autovacuums, so we must resort to manual
vacuum of busy tables, which is bearable but inconvenient.

PostgreSQL version number you are running:

PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-28), 64-bit

How you installed PostgreSQL:

From the pgdg yum repositories.

Changes made to the settings in the postgresql.conf file: see Server
Configuration for a quick way to list them all.

name |
current_setting | source
------------------------------+------------------------------------------------------------------------------+----------------------
application_name | psql
| client
archive_command | rsync -a %p
barman(at)x(dot)x(dot)x(dot)x:/data/backup/barman/ubipgsql-10/incoming/%f
| configuration file
archive_mode | on
| configuration file
checkpoint_completion_target | 0.9
| configuration file
checkpoint_timeout | 25min
| configuration file
client_encoding | UTF8
| client
DateStyle | ISO, MDY
| configuration file
default_text_search_config | pg_catalog.english
| configuration file
dynamic_shared_memory_type | posix
| configuration file
effective_cache_size | 64GB
| configuration file
hot_standby | on
| configuration file
hot_standby_feedback | on
| configuration file
lc_messages | en_US.UTF-8
| configuration file
lc_monetary | en_US.UTF-8
| configuration file
lc_numeric | en_US.UTF-8
| configuration file
lc_time | en_US.UTF-8
| configuration file
listen_addresses | *
| configuration file
log_autovacuum_min_duration | 1s
| configuration file
log_checkpoints | on
| configuration file
log_destination | syslog
| configuration file
log_directory | log
| configuration file
log_filename | postgresql-%a.log
| configuration file
log_line_prefix | user=%u,db=%d,client=%h
| configuration file
log_lock_waits | on
| configuration file
log_min_duration_statement | 1s
| configuration file
log_min_messages | info
| configuration file
log_rotation_age | 1d
| configuration file
log_rotation_size | 0
| configuration file
log_statement | ddl
| configuration file
log_temp_files | 1MB
| configuration file
log_timezone | Europe/Rome
| configuration file
log_truncate_on_rotation | on
| configuration file
logging_collector | on
| configuration file
maintenance_work_mem | 1GB
| configuration file
max_connections | 1000
| configuration file
max_stack_depth | 2MB
| environment variable
max_wal_size | 12GB
| configuration file
min_wal_size | 80MB
| configuration file
password_encryption | scram-sha-256
| configuration file
pg_stat_statements.max | 10000
| configuration file
pg_stat_statements.track | all
| configuration file
shared_buffers | 32GB
| configuration file
shared_preload_libraries | pg_stat_statements
| configuration file
syslog_facility | local0
| configuration file
syslog_ident | postgres
| configuration file
TimeZone | Europe/Rome
| configuration file
track_activity_query_size | 2048
| configuration file
track_functions | all
| configuration file
track_io_timing | on
| configuration file
wal_keep_segments | 200
| configuration file
wal_level | logical
| configuration file
(51 rows)

Operating system and version:

CentOS Linux release 7.4.1708 (Core)
Linux xx.xx.com 3.10.0-693.11.1.el7.x86_64 #1 SMP Mon Dec 4 23:52:40 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux

What program you're using to connect to PostgreSQL:

psql, jdbc driver. It's not an application side error.

Is there anything relevant or unusual in the PostgreSQL server logs?:

Yes, we see these kind of errors:

May 20 16:06:04 ubipgsql01 postgres[26739]: [1380-1] user=,db=,client=
ERROR: found xmin 2889675859 from before relfrozenxid 400011439
May 20 16:06:04 ubipgsql01 postgres[26739]: [1380-2] user=,db=,client=
CONTEXT: automatic vacuum of table "postgres.pg_catalog.pg_authid"

When these errors pop up, autovacuum repeatedly fails on this very same
table, and we have to resort to manual vacuums

For questions about any kind of error:
What you were doing when the error happened / how to cause the error:

The first time we saw the error, we found an idle transaction was left open
for several days. The user was not monitored for long running transactions.
We killed the session, deleted the offending row (the one with the xmin
value reported) and then vacuumed the table.
However, teh same error pops up every time we try to add a new user. We
would like to fix these error once and for all, so we can add new users.

The EXACT TEXT of the error message you're getting, if there is one: (Copy
and paste the message to the email, do not send a screenshot)

This is the error:

psql (10.4)
Type "help" for help.

postgres=# vacuum pg_authid;
ERROR: found xmin 3031994631 from before relfrozenxid 400011439

Some datas:

postgres=# select xmin from pg_authid ;
xmin
------------
1
1
1
1
1
557
7216348
110077819
110511334
3031994631
3032044199
3032044199
3032044199
3032070282
(14 rows)

postgres=# select relfrozenxid from pg_class where relname='pg_authid';
relfrozenxid
--------------
400011439
(1 row)

postgres=#

Is this a sympthom of data corruption or transaction wraparound due to the
long running transaction that we killed weeks ago? This is the only table
in the whole cluster that has this error. We are monitoring transactions
wraparound with the the check_postgres.pl script, the check is still
running fine and no alert was given at all since the cluster has been
running.

Thank you in advance for any answer.

Paolo Crosato

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2018-05-22 14:22:20 Re: Fix for FETCH FIRST syntax problems
Previous Message Tom Lane 2018-05-22 14:09:02 Re: perl checking

Browse pgsql-general by date

  From Date Subject
Next Message Andres Freund 2018-05-22 16:49:11 Re: Error on vacuum: xmin before relfrozenxid
Previous Message Tom Lane 2018-05-22 13:58:16 Re: source of connection fails at pg startup?