Replacing pg_depend PIN entries with a fixed range check

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Replacing pg_depend PIN entries with a fixed range check
Date: 2021-04-15 01:43:28
Message-ID: 3737988.1618451008@sss.pgh.pa.us
Lists: pgsql-hackers

In [1] Andres and I speculated about whether we really need all
those PIN entries in pg_depend. Here is a draft patch that gets
rid of them.

It turns out to be no big problem to replace the PIN entries
with an OID range check, because there's a well-defined point
in initdb where it wants to pin (almost) all existing objects,
and then no objects created after that are pinned. In the earlier
thread I'd imagined having initdb record the OID counter at that
point in pg_control, and then we could look at the recorded counter
value to make is-it-pinned decisions. However, that idea has a
fatal problem: what shall pg_resetwal fill into that field when
it has to gin up a pg_control file from scratch? There's no
good way to reconstruct the value.

Hence, what this patch does is to establish a manually-managed cutoff
point akin to FirstBootstrapObjectId, and make initdb push the OID
counter up to that once it's made the small number of pinned objects
it's responsible for. With the value I used here, a couple hundred
OIDs are wasted, but there seems to be a reasonable amount of headroom
still beyond that. On my build, the OID counter at the end of initdb
is 15485 (with a reasonable number of glibc and ICU locales loaded).
So we still have about 900 free OIDs there; and there are 500 or so
free just below FirstBootstrapObjectId, too. So this approach does
hasten the day when we're going to run out of free OIDs below 16384,
but not by all that much.
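
To make the headroom arithmetic concrete, here is a minimal sketch. The 16384 limit and the 15485 counter value are the ones quoted above; the type, macro, and function names are invented for illustration, and the helper that bumps the counter is only a guess at the shape of that step, not code from the patch.

```c
#include <stdint.h>

typedef uint32_t Oid;

/* First OID available to user-created objects; the text gives it as 16384. */
#define SKETCH_FIRST_NORMAL_OID ((Oid) 16384)

/*
 * Hypothetical helper: push the OID counter up to a cutoff, never moving
 * it backwards.  The OIDs skipped over are the "wasted" ones.
 */
static Oid
sketch_advance_oid_counter(Oid counter, Oid cutoff)
{
	return (counter < cutoff) ? cutoff : counter;
}

/* Number of OIDs still free between the counter and the first normal OID. */
static uint32_t
sketch_free_oids_below_normal(Oid counter)
{
	return (counter < SKETCH_FIRST_NORMAL_OID)
		? SKETCH_FIRST_NORMAL_OID - counter
		: 0;
}
```

With the counter at 15485, sketch_free_oids_below_normal() returns 899, which is the "about 900" figure above.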

There are a couple of objects, namely template1 and the public
schema, that are in the catalog .dat files but are not supposed
to be pinned. The existing code accomplishes that by excluding them
(in two different ways :-() while filling pg_depend. This patch
just hard-wires exceptions for them in IsPinnedObject(), which seems
to me not much uglier than what we had before. The existing code
also handles pinning of the standard tablespaces in an idiosyncratic
way; I just dropped that and made them be treated as pinned.
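
As a rough illustration of what a range check with hard-wired exceptions could look like, here is a standalone sketch. It is not the code in the attached patch: the function name is invented, the cutoff value is a placeholder, and the catalog and object OIDs are what I believe the catalog headers use, so treat them as assumptions too.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Placeholder cutoff for illustration only; NOT the value from the patch. */
#define SKETCH_FIRST_UNPINNED_OID    ((Oid) 12000)

/* Assumed OIDs of the catalogs and objects that need exceptions. */
#define SKETCH_NAMESPACE_RELATION_ID ((Oid) 2615)	/* pg_namespace */
#define SKETCH_DATABASE_RELATION_ID  ((Oid) 1262)	/* pg_database */
#define SKETCH_PUBLIC_NAMESPACE_OID  ((Oid) 2200)	/* schema "public" */
#define SKETCH_TEMPLATE1_DB_OID      ((Oid) 1)		/* database template1 */

/*
 * Decide whether an object is pinned, using only its catalog (classId)
 * and its own OID.  No pg_depend lookup is involved.
 */
static bool
sketch_is_pinned_object(Oid classId, Oid objectId)
{
	/* Anything at or above the cutoff was created after the pinning point. */
	if (objectId >= SKETCH_FIRST_UNPINNED_OID)
		return false;

	/* Below the cutoff, but deliberately droppable: the public schema. */
	if (classId == SKETCH_NAMESPACE_RELATION_ID &&
		objectId == SKETCH_PUBLIC_NAMESPACE_OID)
		return false;

	/* Likewise template1. */
	if (classId == SKETCH_DATABASE_RELATION_ID &&
		objectId == SKETCH_TEMPLATE1_DB_OID)
		return false;

	/* Everything else below the cutoff is treated as pinned. */
	return true;
}
```

Because the answer depends only on constants, a check of this shape gives the same result at any point during initdb, with no need to wait for pg_depend to be populated.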

One interesting point about doing things this way is that
IsPinnedObject() will give correct answers throughout initdb, whereas
before the backend couldn't tell what was supposed to be pinned until
after initdb loaded pg_depend. This means we don't need the hacky
truncation of pg_depend and pg_shdepend that initdb used to do,
because now the backend will correctly not make entries relating to
objects it now knows are pinned. Aside from saving a few cycles,
this is more correct. For example, if some object that initdb made
after bootstrap but before truncating pg_depend had a dependency on
the public schema, the existing coding would lose track of that fact.
(There's no live issue of that sort, I hasten to say, and really it
would be a bug to set things up that way because then you couldn't
drop the public schema. But the existing coding would make things
worse by not detecting the mistake.)

Anyway, as to concrete results:

* pg_depend's total relation size, in a freshly made database,
drops from 1269760 bytes to 368640 bytes.

* There seems to be a small but noticeable reduction in the time
to run check-world. I compared runtimes on a not-particularly-modern
machine with spinning-rust storage, using -j4 parallelism:

HEAD
real    5m4.248s
user    2m59.390s
sys     1m21.473s

+ patch
real    5m2.924s
user    2m36.196s
sys     1m19.724s

The wall-clock numbers don't look too impressive, but the reduction
in user CPU time seems quite significant. On a different hardware
platform, that would probably translate more directly into runtime savings.
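
For anyone who wants the results above as percentages, a small sketch follows. The inputs are the figures quoted above; the helper names are invented for illustration.

```c
/* Convert a "XmY.YYYs" reading, as printed by time, into seconds. */
static double
sketch_to_seconds(int minutes, double seconds)
{
	return 60.0 * minutes + seconds;
}

/*
 * Percentage by which `after` is smaller than `before`.
 * Assumes before > 0.
 */
static double
sketch_percent_reduction(double before, double after)
{
	return 100.0 * (before - after) / before;
}
```

Fed the numbers above, this gives roughly a 71% reduction in pg_depend's size (1269760 to 368640 bytes), about 12.9% in user time, about 2.1% in sys time, and about 0.4% in real time.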

I didn't try to reproduce the original performance bottleneck
that was complained of in [1], but that might be fun to check.

Anyway, I'll stick this in the next CF so we don't lose track
of it.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/947172.1617684433%40sss.pgh.pa.us#6a3d250a9c4a994cb3a26c87384fc823

Attachment: remove-pg_depend-PIN-entries-1.patch (text/x-diff, 42.8 KB)
