pg_upgrade failing for 200+ million Large Objects

From: "Tharakan, Robins" <tharar(at)amazon(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_upgrade failing for 200+ million Large Objects
Date: 2021-03-03 11:36:26
Message-ID: 12601596dbbc4c01b86b4ac4d2bd4d48@EX13D05UWC001.ant.amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While reviewing a failed upgrade from Postgres v9.5 (to v9.6) I saw that the
instance had ~200 million (in-use) Large Objects. I was able to reproduce
this on a test instance which too fails with a similar error.

pg_restore: executing BLOB 4980622
pg_restore: WARNING: database with OID 0 must be vacuumed within 1000001
transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in that
database.
You might also need to commit or roll back old prepared transactions.
pg_restore: executing BLOB 4980623
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2565; 2613 4980623 BLOB
4980623 postgres
pg_restore: [archiver (db)] could not execute query: ERROR: database is not
accepting commands to avoid wraparound data loss in database with OID 0
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions.
Command was: SELECT pg_catalog.lo_create('4980623');

To remove the obvious possibilities, these Large Objects that are still
in-use (so vacuumlo wouldn't help), giving more system resources doesn't
help, moving Large Objects around to another database doesn't help (since
this is cluster-wide restriction), the source instance is nowhere close to
wraparound and lastly recent-most minor versions don't help either (I tried
compiling 9_6_STABLE + upgrade database with 150 million LO and still
encountered the same issue).

Do let me know if I am missing something obvious but it appears that this is
happening owing to 2 things coming together:

* Each Large Object is migrated in its own transaction during pg_upgrade
* pg_resetxlog appears to be narrowing the window (available for pg_upgrade)
to ~146 Million XIDs (2^31 - 1 million XID wraparound margin - 2 billion
which is a hard-coded constant - see [1] - in what appears to be an attempt
to force an Autovacuum Wraparound session soon after upgrade completes).

Ideally such an XID based restriction, is limiting for an instance that's
actively using a lot of Large Objects. Besides forcing AutoVacuum Wraparound
logic to kick in soon after, I am unclear what much else it aims to do. What
it does seem to be doing is to block Major Version upgrades if the
pre-upgrade instance has >146 Million Large Objects (half that, if the LO
additionally requires ALTER LARGE OBJECT OWNER TO for each of those objects
during pg_restore)

For long-term these ideas came to mind, although am unsure which are
low-hanging fruits and which outright impossible - For e.g. clubbing
multiple objects in a transaction [2] / Force AutoVacuum post upgrade (and
thus remove this limitation altogether) or see if "pg_resetxlog -x" (from
within pg_upgrade) could help in some way to work-around this limitation.

Is there a short-term recommendation for this scenario?

I can understand a high number of small-sized objects is not a great way to
use pg_largeobject (since Large Objects was intended to be for, well, 'large
objects') but this magic number of Large Objects is now a stalemate at this
point (with respect to v9.5 EOL).

Reference:
1) pg_resetxlog -
https://github.com/postgres/postgres/blob/ca3b37487be333a1d241dab1bbdd17a211
a88f43/src/bin/pg_resetwal/pg_resetwal.c#L444
2)
https://www.postgresql.org/message-id/ed7d86a1-b907-4f53-9f6e-63482d2f2bac%4
0manitou-mail.org

-
Thanks
Robins Tharakan

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-03-03 11:44:44 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Masahiro Ikeda 2021-03-03 11:27:29 Re: About to add WAL write/fsync statistics to pg_stat_wal view