Re: pg_upgrade failing for 200+ million Large Objects

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Robins Tharakan <tharakan(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2023-12-21 03:16:14
Message-ID: 20231221031614.GA836232@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 20, 2023 at 06:47:44PM -0500, Tom Lane wrote:
> I have spent some more effort in this area and developed a patch
> series that I think addresses all of the performance issues that
> we've discussed in this thread, both for pg_upgrade and more
> general use of pg_dump/pg_restore. Concretely, it absorbs
> the pg_restore --transaction-size switch that I proposed before
> to cut the number of transactions needed during restore, and
> rearranges the representation of BLOB-related TOC entries to
> reduce the client-side memory requirements, and fixes some
> ancient mistakes that prevent both selective restore of BLOBs
> and parallel restore of BLOBs.
>
> As a demonstration, I made a database containing 100K empty blobs,
> and measured the time needed to dump/restore that using -Fd
> and -j 10. HEAD doesn't get any useful parallelism on blobs,
> but with this patch series we do:
>
> dump restore
> HEAD: 14sec 15sec
> after 0002: 7sec 10sec
> after 0003: 7sec 3sec

Wow, thanks for putting together these patches. I intend to help review,
but I'm not sure I'll find much time to do so before the new year.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-12-21 03:17:27 Re: Function to get invalidation cause of a replication slot.
Previous Message Michael Paquier 2023-12-21 03:07:56 Re: Track in pg_replication_slots the reason why slots conflict?