Re: Improving connection scalability: GetSnapshotData()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jonathan Katz <jkatz(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-04-07 19:43:27
Message-ID: 20200407194327.6u67yoqe3n4hdwmu@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-07 15:03:46 -0400, Robert Haas wrote:
> On Tue, Apr 7, 2020 at 1:51 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > ComputedHorizons seems like a fairly generic name. I think there's
> > > some relationship between InvisibleToEveryoneState and
> > > ComputedHorizons that should be brought out more clearly by the naming
> > > and the comments.
> >
> > I don't like the naming of ComputedHorizons, ComputeTransactionHorizons
> > much... But I find it hard to come up with something that's meaningfully
> > better.
>
> It would help to stick XID in there, like ComputedXIDHorizons. What I
> find really baffling is that you seem to have two structures in the
> same file that have essentially the same purpose, but the second one
> (ComputedHorizons) has a lot more stuff in it. I can't understand why.

ComputedHorizons are the various "accurate" horizons computed by
ComputeTransactionHorizons(). That's used to determine a horizon for
vacuuming (via GetOldestVisibleTransactionId()) and other similar use
cases.

The various InvisibleToEveryoneState variables contain the boundary
based horizons, and are updated / initially filled by
GetSnapshotData(). When the a tested value falls between the boundaries,
we update the approximate boundaries using
ComputeTransactionHorizons(). That briefly makes the boundaries in
the InvisibleToEveryoneState accurate - but future GetSnapshotData()
calls will increase the definitely_needed_bound (if transactions
committed since).

The ComputedHorizons fields could instead just be pointer based
arguments to ComputeTransactionHorizons(), but that seems clearly
worse.

I'll change ComputedHorizons's comment to say that it's the result of
ComputeTransactionHorizons(), not the "state".

> > What's the inconsistency? The dropped replication_ vs dropped FullXid
> > postfix?
>
> Yeah, just having the member names be randomly different between the
> structs. Really harms greppability.

The long names make it hard to keep line lengths in control, in
particular when also involving the long macro names for TransactionId /
FullTransactionId comparators...

> > > Generally, heap_prune_satisfies_vacuum looks pretty good. The
> > > limited_oldest_committed naming is confusing, but the comments make it
> > > a lot clearer.
> >
> > I didn't like _committed much either. But couldn't come up with
> > something short. _relied_upon?
>
> oldSnapshotLimitUsed or old_snapshot_limit_used, like currentCommandIdUsed?

Will go for old_snapshot_limit_used, and rename the other variables to
old_snapshot_limit_ts, old_snapshot_limit_xmin.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2020-04-07 19:45:57 Re: Binary support for pgoutput plugin
Previous Message Dmitry Dolgov 2020-04-07 19:42:20 Re: Index Skip Scan