Re: pg_upgrade failing for 200+ million Large Objects

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: "Tharakan, Robins" <tharar(at)amazon(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2021-03-08 12:33:58
Message-ID: CABUevEwyLb9VE0D+bAQtUnaA7bffXYzBpopYuh7kGTQxY9T5_g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 8, 2021 at 12:02 PM Tharakan, Robins <tharar(at)amazon(dot)com> wrote:
>
> Thanks Peter.
>
> The original email [1] had some more context that somehow didn't get
> associated with this recent email. Apologies for any confusion.

Please take a look at your email configuration -- all your emails are
missing both the References and In-Reply-To headers, so every message
starts a new thread, both in people's mail readers and in the archives.
That seems quite broken, and it makes the discussion very hard to follow.

> In short, pg_resetxlog (and pg_resetwal) employs a magic constant [2] (in
> both v9.6 and master) which seems to have been chosen to force an
> aggressive autovacuum as soon as the upgrade completes. Although that works
> as planned, it narrows the window of transaction IDs available to the
> upgrade (before XID wraparound protection kicks in and aborts it) to
> 146 million.
>
> Reducing this magic constant allows a larger XID window, which is what the
> patch is trying to do. With the patch, I was able to successfully upgrade a
> cluster with 500 million Large Objects (which otherwise reliably fails). In
> the original email [1] I had also listed a few other possible workarounds,
> but was unsure which would be a good direction to start working on -- hence
> this patch, to make a start.
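
For context, a rough back-of-the-envelope for that 146 million figure (an
assumption on my part: the constant in question is the 2000000000 that
pg_resetwal subtracts from the new next-XID to derive oldestXid, which its
comment says matches the maximum autovacuum_freeze_max_age):

    -- quick arithmetic in psql; the stop-limit margin is approximate
    SELECT 2^31 - 2000000000 AS xid_headroom;
    --  => 147483648; subtracting the anti-wraparound stop-limit margin
    --     (on the order of a million XIDs) leaves roughly 146 million
    --     XIDs for the entire upgrade to consume.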

This still seems to just fix the symptoms and not the actual problem.

What part of the pg_upgrade process is it that actually burns through
that many transactions?

Without looking, I would guess it's the schema reload using
pg_dump/pg_restore, and not pg_upgrade itself. This is a known issue in
pg_dump/pg_restore. If that is the case -- perhaps running all of those
statements in a single transaction would be a better choice? One could
argue it's still not a proper fix, because we'd still have huge memory
usage etc., but it would then burn only 1 xid instead of 500M...
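
As a rough illustration of the xid accounting (a hypothetical psql session,
not what pg_upgrade actually runs): statements outside an explicit
transaction block each commit -- and burn an xid -- on their own, while a
BEGIN/COMMIT wrapper assigns a single xid for the whole batch:

    SELECT lo_create(0);    -- autocommit: one xid
    SELECT lo_create(0);    -- autocommit: another xid

    BEGIN;
    SELECT lo_create(0);
    SELECT lo_create(0);
    -- ... arbitrarily many more statements ...
    COMMIT;                 -- the whole block consumes just one xid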

AFAICT from a quick check, pg_dump in binary-upgrade mode emits one
lo_create() and one ALTER ... OWNER TO for each large object -- so with
500M large objects that would be a billion statements, and thus a
billion xids. And without checking, I'm fairly sure those aren't loaded
in a single transaction...
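
(For reference, the per-object statements in question look roughly like this
in a --binary-upgrade dump -- the OID and owner name below are placeholders,
and the exact emitted form may differ by version:)

    SELECT pg_catalog.lo_create('16384');
    ALTER LARGE OBJECT 16384 OWNER TO some_owner;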

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
