BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: thusson(at)informiciel(dot)com
Subject: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.
Date: 2019-06-07 18:22:20
Message-ID: 15840-06c9565bfd8099f6@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 15840
Logged by: Thierry Husson
Email address: thusson(at)informiciel(dot)com
PostgreSQL version: 12beta1
Operating system: Ubuntu 18.04.2 LTS
Description:

I was doing tables COPY between my old server with PG10.8 and the new one
with 12Beta1. After each table is done, I make a vacuum of it.
However PG12 has stopped working for wraparound protection. I was doing it
on around 10 cpu, 1 table by cpu.
The is the end of the log of the copy program in Python3 with Psycopg2:
...
2019-06-06 23:15:26 prog_sync Vacuum
usr_ops.prg_hrdps_n1500n_voisin_ade_metar... 4s.
2019-06-06 23:15:30 prog_sync Vacuum
usr_ops.flt_hrdps_n1500n_voisin_ade_metar... 0s.
2019-06-06 23:15:30 prog_sync CPU 0 - Sync done 8857sec.
2019-06-06 23:15:30 prog_sync Tables Skipped:0, Already sync:0, Copied
from pravda:1. Copied from zhen:0.
2019-06-06 23:15:30 prog_sync Sync done for 1 tables of 106451311 records in
8858s. (12018 rec./sec.)

Traceback (most recent call last):
File "/home/semt700/emet/script/prog_sync.py", line 316, in syncTable
ioResult = e.flushCopyBuffer(ioResult, curPG[slave][procId],
progTable[slave], columns)
File "/fs/home/fs1/eccc/cmd/cmdn/semt700/emet/script/emetlib.py", line
607, in flushCopyBuffer
cursorObj.copy_from(ioBuffer, tableName, sep='\t', columns=columnName,
null='NULL')
psycopg2.OperationalError: database is not accepting commands to avoid
wraparound data loss in database "emet_zhen"
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.
CONTEXT: SQL statement "INSERT INTO
usr_ops.prg_gdps_g1610n_voisin_ade_synop_swob_metar_201903 SELECT $1.*"
PL/pgSQL function prog_insert() line 17 at EXECUTE
COPY prg_gdps_g1610n_voisin_ade_synop_swob_metar, line 132822: "284532738
2019-03-20 00:00:00 2019-03-29 12:00:00 11011 37980000
-101750000 75597472 NULL -5.4617 1 ..."

I did a DB shutdown and started a vacuum with:
postgres --single emet_zhen
VACUUM FREEZE VERBOSE;

It worked a few hours and when I was thinking it was done as nothing was
loggin anymore, I made a ctrl-\ and restarted the DB.
I was still getting wraparound protection messages so I shutdown the DB
again & redo the vacuum command but it didn't work anymore:
zhen:semt700 $ postgres --single emet_zhen
2019-06-07 17:23:36 UTC 7251 WARNING: database with OID 16394 must be
vacuumed within 999995 transactions
2019-06-07 17:23:36 UTC 7251 HINT: To avoid a database shutdown, execute a
database-wide VACUUM in that database.
You might also need to commit or roll back old prepared
transactions, or drop stale replication slots.
PostgreSQL stand-alone backend 12beta1
backend> VACUUM VERBOSE;
2019-06-07 17:23:59 UTC 7251 WARNING: database "emet_zhen" must be
vacuumed within 999995 transactions
2019-06-07 17:23:59 UTC 7251 HINT: To avoid a database shutdown, execute a
database-wide VACUUM in that database.
You might also need to commit or roll back old prepared
transactions, or drop stale replication slots.
2019-06-07 17:23:59 UTC 7251 LOG: duration: 2417.639 ms statement: VACUUM
VERBOSE;

I tried with various options but none worked. It also tried to restard the
DB and use vacuumdb --all -v , or various options, but always get the same
message for each table:

INFO: aggressively vacuuming "pg_catalog.pg_publication"
INFO: index "pg_publication_oid_index" now contains 0 row versions in 1
pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: index "pg_publication_pubname_index" now contains 0 row versions in 1
pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "pg_publication": found 0 removable, 0 nonremovable row versions in 0
out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin:
2146520116
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
WARNING: database "emet_zhen" must be vacuumed within 999995 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in that
database.
You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.

I out of clues of what to try next. I already got this situation with PG 9.x
& PG10.x but system wide in exclusive mode usualy worked.

Seems like a PG12 bug that will certainly prevent us from upgrading even if
the new fonctionnalities look really great.

Thanks a lot!

Thierry

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2019-06-07 19:02:31 Re: BUG #15840: Vacuum does not work after database stopped for wraparound protection. Database seems unrepearable.
Previous Message David G. Johnston 2019-06-07 16:06:44 Re: BUG #15839: Using text field for sorting in prepared query leads to wrong result

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2019-06-07 18:27:58 Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Previous Message Alexander Korotkov 2019-06-07 18:07:12 Re: [PROPOSAL] Drop orphan temp tables in single-mode