Assorted improvements in pg_dump

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Assorted improvements in pg_dump
Date: 2021-10-20 21:14:45
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Attached is a series of patches that address some performance and
correctness problems we've recently identified in pg_dump.

Patch 0001 refactors the way pg_dump keeps track of which "components"
(definition, data, comment, ACL, etc) of a dumpable object need to be
dumped. The problem addressed here is that the current coding breaks
the intended optimization that we don't run pg_dump's secondary data
collection queries for objects we're not actually going to dump.
That's because we initialize DumpableObject.dump to DUMP_COMPONENT_ALL
(0xFFFF), and then we're not very careful about clearing meaningless
bits out of that. That leads to the bitmask not being zero when it's
tested at the start of a per-object dump function such as dumpFunc(),
even though DUMP_COMPONENT_DEFINITION might be clear and the object
might not have any comment, security label, or ACL. An example
of the problem here is that we'll end up running through dumpFunc()
for every function defined in an extension, even though the only
ones we'd print anything for are those with modified ACLs. In a
database with a lot of extensions, that results in a lot of useless
queries and consequent performance problems [1].

There are a couple of ways we could rewrite this, but what seemed
to me to be the clearest and most robust is to separate
DumpableObject.dump into two bitmasks, one recording the components
we've requested to dump and the other recording the components we've
actually found the object to possess. (The existing logic conflates
these purposes by setting and later clearing bits, which I think is
confusing and also bug-prone.) Then, when we reach the point where we
need to decide if there's anything to do, we can AND those two masks
together to derive the set of things we are really going to dump.

So the patch adds a "components" field with the same bits that "dump"
has, and arranges to set appropriate bits of that field as we discover
that objects have comments, ACLs, etc. (I wonder if we should rename
"dump", perhaps to "requests" or the like. But I didn't do so here.)
It's straightforward other than that, with just a few things maybe
worthy of note:
1. collectComments and collectSecLabels are now run unconditionally,
not on-demand, because we need those bit-setting actions to happen
before we start dumping objects.
2. I had to add collection of pg_class.reltype so that when we find
a comment or seclabel for a composite type's column, we can redirect
the bit-setting action from the pg_class entry to the pg_type entry.
3. Testing for no-components-to-dump is now centralized in
dumpDumpableObject(), instead of each dumpXXX function doing it.

Patch 0002 rethinks the way we handle dumping of ACLs. My main
goal here was to get rid of the expensive sub-selects that the
pg_init_privs patch added. The core idea is to drop all of those
in favor of just reading pg_init_privs once and loading the
information into the DumpableObjects, much the same way as we
handle comments and seclabels. I also realized that that patch's
insistence on doing assembly/disassembly of ACLs in SQL was a big
performance loser. We can calculate the delta between two ACLs
right in buildACLCommands, just by doing string comparisons after
we've disassembled the aclitems array into elements, so it's really
pretty cheap.

This also led me to the conclusion that it was a bad idea to have
preserved the "old" (pre-9.6) logic in buildACLCommands. The new
approach of calculating "add" and "remove" lists as deltas from an
acldefault() value works perfectly well before 9.6, at least for
branches back to 9.2 where acldefault() was added. For older servers,
I thought briefly about putting hard-wired knowledge of the older
branches' default ACLs into dumputils.c, but rejected that in favor of
emitting REVOKE ALL and then emitting all the ACL's items when we are
lacking acldefault() results. This turns out to be just as efficient,
and sometimes more so, as what the old code was doing. For example,
comparing pg_dump of the 9.1 regression database between old and new
logic, the only difference is

@@ -60,9 +60,7 @@
-- Name: SCHEMA public; Type: ACL; Schema: -; Owner: postgres

-REVOKE ALL ON SCHEMA public FROM postgres;
-GRANT ALL ON SCHEMA public TO postgres;

for a net savings of two commands. The REVOKE USAGE appears because
of the hacks Noah added in a7a7be1f2 and b073c3ccd to cause pg_dump
to believe that schema public has a specific initprivs value even
when it doesn't. In HEAD, that code only triggers for dumps from
servers >= 9.6, but this patch makes it apply before that too, which
seems correct to me. A target server >= v15 is going to have that
ACL for the public schema, no matter what the source version is.

(BTW, to replicate that behavior I ended up having to write some
client-side functions to construct the text form of an aclitems array.
The duplication of logic with aclitemout() is a bit annoying. Not
sure if it's worth trying to refactor to combine code.)

0002 does not yet incorporate any logic change corresponding to
the bug fix under discussion at [2], but it will be easy to add.
I left that out for now so as not to change any test results.

Lastly, patch 0003 addresses the concern I raised at [3] that it's
unsafe to call pg_get_partkeydef() and pg_get_expr(relpartbound)
in getTables(). Looking closer I realized that we can't cast
pg_class.reloftype to regtype at that point either, since regtypeout
is going to notice if the type has been concurrently dropped.

In [3] I'd imagined that we could just delay those calls to a second
query in getTables(), but that doesn't work at all: if we apply
these functions to every row of pg_class, we still risk failure
against any relation that we didn't lock. So there seems little
alternative but to push these functions out to secondary queries
executed later.

Arguably, 0003 is a bug fix that we should consider back-patching.
However, I've not heard field reports of the problems it fixes,
so maybe there's no need to bother.

Putting all of this together, I did some performance measurements
comparing HEAD pg_dump to pg_dump with these three patches. On
my machine, using the current regression database as a test case,
I find that "pg_dump -s regression >/dev/null" requires 1.21 sec
on HEAD and 0.915 sec with these patches, or about 24% faster.
Even more importantly, the time to execute getTables' query drops
from 129 ms to 16 ms. Since that's the window before we can start
to acquire table locks, making it as short as possible is useful.

To have another data point, I also experimented with a database
constructed like this:

do $$
for i in 1..2500 loop
execute 'create table tst' || i || ' (f1 int primary key, f2 text)';
end loop;
end $$;

For that, "pg_dump -s" required 4.94 sec vs 4.38 sec, or about 11%
faster, and the getTables query dropped from 316 ms to 41 ms.

There's some other micro-optimizations I'm thinking about, but this
probably gets most of the available win, and it seems like a coherent
set of changes to present at once.

regards, tom lane


Attachment Content-Type Size
0001-fix-component-masking-1.patch text/x-diff 47.1 KB
0002-rethink-ACL-handling-1.patch text/x-diff 120.2 KB
0003-move-unsafe-function-calls-1.patch text/x-diff 9.7 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-10-20 21:46:01 Re: Can we get rid of repeated queries from pg_dump?
Previous Message Stephen Frost 2021-10-20 20:36:06 Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)