Re: Improving connection scalability: GetSnapshotData()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jonathan Katz <jkatz(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-04-07 20:13:07
Message-ID: CA+TgmoZg34dKfTWOKjvxYU6mpE1=NkgLaS1hciqH+qXk2tv36A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 7, 2020 at 3:24 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > 0007 -
> >
> > + TransactionId xidCopy; /* this backend's xid, a copy of this proc's
> > + ProcGlobal->xids[] entry. */
> >
> > Can we please NOT put Copy into the name like that? Pretty please?
>
> Do you have a suggested naming scheme? Something indicating that it's
> not the only place that needs to be updated?

I don't think trying to indicate that in the structure member names is
a useful idea. I think you should give them the same names, maybe with
an "s" to pluralize the copy hanging off of ProcGlobal, and put a
comment that says something like:

We keep two copies of each of the following three fields. One copy is
here in the PGPROC, and the other is in a more densely-packed array
hanging off of PGXACT. Both copies of the value must always be updated
at the same time and under the same locks, so that it is always the
case that MyProc->xid == ProcGlobal->xids[MyProc->pgprocno] and
similarly for vacuumFlags and WHATEVER. Note, however, that the arrays
attached to ProcGlobal only contain entries for PGPROC structures that
are currently part of the ProcArray (i.e. there is currently a backend
for that PGPROC). We use those arrays when STUFF and the copies in the
individual PGPROC when THINGS.

> I think it's more on-point here, because we need to hold either of the
> locks* even, for changes to a backend's own status that one reasonably
> could expect would be safe to at least inspect.

It's just too brief and obscure to be useful.

> > + ProcGlobal->xids[pgxactoff] = InvalidTransactionId;
> >
> > Apparently this array is not dense in the sense that it excludes
> > unused slots, but comments elsewhere don't seem to entirely agree.
>
> What do you mean with "unused slots"? Backends that committed?

Backends that have no XID. You mean, I guess, that it is "dense" in
the sense that only live backends are in there, not "dense" in the
sense that only active write transactions are in there. It would be
nice to nail that down better; the wording I suggested above might
help.

> > + uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
> >
> > /* ProcGlobal */
> > size = add_size(size, sizeof(PROC_HDR));
> > - /* MyProcs, including autovacuum workers and launcher */
> > - size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
> > - /* AuxiliaryProcs */
> > - size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
> > - /* Prepared xacts */
> > - size = add_size(size, mul_size(max_prepared_xacts, sizeof(PGPROC)));
> > - /* ProcStructLock */
> > + size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
> >
> > This seems like a bad idea. If we establish a precedent that it's OK
> > to have sizing routines that don't use add_size() and mul_size(),
> > people are going to cargo cult that into places where there is more
> > risk of overflow than there is here.
>
> Hm. I'm not sure I see the problem. Are you concerned that TotalProcs
> would overflow due to too big MaxBackends or max_prepared_xacts? The
> multiplication itself is still protected by add_size(). It didn't seem
> correct to use add_size for the TotalProcs addition, since that's not
> really a size. And since the limit for procs is much lower than
> UINT32_MAX...

I'm concerned that there are 0 uses of add_size in any shared-memory
sizing function, and I think it's best to keep it that way. If you
initialize TotalProcs = add_size(MaxBackends,
add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts)) then I'm happy. I
think it's a desperately bad idea to imagine that we can dispense with
overflow checks here and be safe. It's just too easy for that to
become false due to future code changes, or get copied to other places
where it's unsafe now.

> > You've got a bunch of different places that talk about the new PGXACT
> > array and they are somewhat redundant yet without saying exactly the
> > same thing every time either. I think that needs cleanup.
>
> Could you point out a few of those comments, I'm not entirely sure which
> you're talking about?

+ /*
+ * Also allocate a separate arrays for data that is frequently (e.g. by
+ * GetSnapshotData()) accessed from outside a backend. There is one entry
+ * in each for every *live* PGPROC entry, and they are densely packed so
+ * that the first procArray->numProc entries are all valid. The entries
+ * for a PGPROC in those arrays are at PGPROC->pgxactoff.
+ *
+ * Note that they may not be accessed without ProcArrayLock held! Upon
+ * ProcArrayRemove() later entries will be moved.
+ *
+ * These are separate from the main PGPROC array so that the most heavily
+ * accessed data is stored contiguously in memory in as few cache lines as
+ * possible. This provides significant performance benefits, especially on
+ * a multiprocessor system.
+ */

+ * Arrays with per-backend information that is hotly accessed, indexed by
+ * PGPROC->pgxactoff. These are in separate arrays for three reasons:
+ * First, to allow for as tight loops accessing the data as
+ * possible. Second, to prevent updates of frequently changing data from
+ * invalidating cachelines shared with less frequently changing
+ * data. Third to condense frequently accessed data into as few cachelines
+ * as possible.

+ *
+ * The various *Copy fields are copies of the data in ProcGlobal arrays that
+ * can be accessed without holding ProcArrayLock / XidGenLock (see PROC_HDR
+ * comments).

+ * Adding/Removing an entry into the procarray requires holding *both*
+ * ProcArrayLock and XidGenLock in exclusive mode (in that order). Both are
+ * needed because the dense arrays (see below) are accessed from
+ * GetNewTransactionId() and GetSnapshotData(), and we don't want to add
+ * further contention by both using one lock. Adding/Removing a procarray
+ * entry is much less frequent.

I'm not saying these are all entirely redundant with each other;
that's not so. But I don't think it gives a terribly clear grasp of
the overall picture either, even taking all of them together.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Floris Van Nee 2020-04-07 20:19:08 RE: Index Skip Scan
Previous Message Hamid Akhtar 2020-04-07 20:10:54 Re: BUG #16346: pg_upgrade fails on a trigger with a comment