Re: pg_upgrade failing for 200+ million Large Objects

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Kumar, Sachin" <ssetiya(at)amazon(dot)com>, Robins Tharakan <tharakan(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Date: 2024-03-15 23:18:41
Message-ID: 2751089.1710544721@sss.pgh.pa.us
Lists: pgsql-hackers

This patch seems to have stalled out again. In hopes of getting it
over the finish line, I've done a bit more work to address the two
loose ends I felt were probably essential to deal with:

* Duplicative blob ACLs are now merged into a single TOC entry
(per metadata group) with the GRANT/REVOKE commands stored only
once. This is to address the possibly-common case where a database
has a ton of blobs that have identical-but-not-default ACLs.

I have not done anything about improving efficiency for blob comments
or security labels. I think it's reasonable to assume that blobs with
comments are pets not cattle, and there won't be many of them.
I suppose it could be argued that seclabels might be used like ACLs
with a lot of duplication, but I doubt that there's anyone out there
at all putting seclabels on blobs in practice. So I don't care to
expend effort on that.
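As an illustration of the ACL merging described above, here is a minimal
sketch, not the patch's actual code. It assumes blobs are grouped by an
identical (owner, ACL) pair and that a group is capped at a fixed number of
members; the function name, the tuple layout, and the `max_per_group`
parameter are all hypothetical.

```python
from collections import defaultdict


def group_blobs_by_acl(blobs, max_per_group=1000):
    """Group large-object OIDs that share the same owner and ACL.

    `blobs` is an iterable of (oid, owner, acl) tuples.  Each returned
    group stands in for one metadata entry whose GRANT/REVOKE commands
    would be stored once rather than once per blob.  The grouping key and
    the cap are assumptions made for this sketch.
    """
    by_key = defaultdict(list)
    for oid, owner, acl in blobs:
        by_key[(owner, acl)].append(oid)

    groups = []
    for (owner, acl), oids in by_key.items():
        oids.sort()
        # Split oversized groups so no single entry grows without bound.
        for start in range(0, len(oids), max_per_group):
            groups.append(
                {"owner": owner, "acl": acl,
                 "oids": oids[start:start + max_per_group]}
            )
    return groups
```

With many blobs sharing one non-default ACL, the number of stored ACL
entries then scales with the number of groups rather than the number of
blobs.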

* Parallel pg_upgrade cuts the --transaction-size given to concurrent
pg_restore jobs by the -j factor. This is to ensure we keep the
shared locks table within bounds even in parallel mode.
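
The arithmetic behind the second item can be sketched as follows. This is
only an illustration of "cut by the -j factor": the function name, the base
value used in the example, and the lower bound are assumptions, not values
taken from the patch.

```python
def per_job_transaction_size(base_size, jobs, floor=10):
    """Return the --transaction-size to hand to each concurrent pg_restore.

    Dividing by the job count keeps the total number of objects locked
    across all jobs at roughly `base_size`.  The `floor` (hypothetical)
    keeps the result usable when the job count is very large.
    """
    if jobs <= 1:
        return base_size
    return max(base_size // jobs, floor)


# Example with an assumed base size of 1000 and four parallel jobs.
print(per_job_transaction_size(1000, 4))
```

The point of the division is that N jobs each holding locks for a full-size
transaction would need about N times as many lock table entries as a single
job would.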

Now we could go further than that and provide some direct user
control over these hard-wired settings, but I think that could
be left for later, getting some field experience before we design
an API. In short, I think this patchset is more or less committable.

0001-0004 are rebased up to HEAD, but differ only in line numbers
from the v10 patchset. 0005 handles ACL merging, and 0006 reduces
the --transaction-size used in parallel pg_upgrade.

regards, tom lane

Attachment                                                       Content-Type  Size
v11-0001-Some-small-preliminaries-for-pg_dump-changes.patch      text/x-diff   5.9 KB
v11-0002-In-dumps-group-large-objects-into-matching-metad.patch  text/x-diff   39.8 KB
v11-0003-Move-BLOBS-METADATA-TOC-entries-into-SECTION_DAT.patch  text/x-diff   3.0 KB
v11-0004-Invent-transaction-size-option-for-pg_restore.patch     text/x-diff   15.4 KB
v11-0005-Improve-storage-efficiency-for-BLOB-ACLs.patch          text/x-diff   17.3 KB
v11-0006-Be-more-conservative-about-transaction-size-in-p.patch  text/x-diff   2.0 KB
