Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Date: 2020-03-30 05:28:09
Message-ID: 20200330052809.GB2324620@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 25, 2020 at 04:42:31PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Thu, Mar 19, 2020 at 06:04:52PM -0400, Tom Lane wrote:
> >> 3. It feels like the proposed test of cutoff position against both
> >> ends of a segment is a roundabout way of fixing the problem. I
> >> wonder whether we ought not pass *both* the cutoff and the current
> >> endpoint (latest_page_number) down to the truncation logic, and
> >> have it compare against both of those values.
>
> > Since latest_page_number can keep changing throughout SlruScanDirectory()
> > execution, that would give a false impression of control. Better to
> > demonstrate that the xidWrapLimit machinery keeps latest_page_number within
> > acceptable constraints than to ascribe significance to a comparison with a
> > stale latest_page_number.
>
> Perhaps. I'm prepared to accept that line of argument so far as the clog
> SLRU goes, but I'm not convinced that the other SLRUs have equally robust
> defenses against advancing too far. So on the whole I'd rather that the
> SLRU logic handled this issue strictly on the basis of what it knows,
> without assumptions about what calling code may be doing. Still, maybe
> we only really care about the risk for the clog SLRU?

PerformOffsetsTruncation() is the most at-risk, since a single VACUUM could
burn millions of multixacts via FreezeMultiXactId() calls. (To make that
happen in single-user mode, I suspect one could use prepared transactions as
active lockers and/or in-progress updaters.) I'm not concerned about other
SLRUs. TruncateCommitTs() moves in lockstep with TruncateCLOG(). The other
SimpleLruTruncate() callers handle data that becomes obsolete at every
postmaster restart.

> > Exactly.
> > https://docs.google.com/drawings/d/1xRTbQ4DVyP5wI1Ujm_gmmY-cC8KKCjahEtsU_o0fC7I
> > uses your octagon to show the behaviors before and after this patch.
>
> Cool, thanks for drafting that up. (My original sketch was not of
> publishable quality ;-).) To clarify, the upper annotations probably
> ought to read "nextXid <= xidWrapLimit"?

It diagrams the scenario of nextXid reaching xidWrapLimit, so the green dot
represents both values.

> And "cutoffPage" ought
> to be affixed to the orange dot at lower right of the center image?

No; oldestXact and cutoffPage have the same position in that diagram, because
the patch causes the cutoffPage variable to denote the page that contains
oldestXact. I've now added an orange dot to show that.

> I agree that this diagram depicts why we have a problem right now,
> and the right-hand image shows what we want to have happen.
> What's a little less clear is whether the proposed patch achieves
> that effect.
>
> In particular, after studying this awhile, it seems like removal
> of the initial "cutoffPage -= cutoffPage % SLRU_PAGES_PER_SEGMENT"
> adjustment isn't really affecting anything.

True. The set of unlink() calls needs to be the same for oldestXact in the
first page of a segment, in the last page, or in some interior page. Removing
the rounding neither helps nor hurts correctness.

> So I think what we're actually trying to accomplish here is to
> ensure that instead of deleting up to half of the SLRU space
> before the cutoff, we delete up to half-less-one-segment.
> Maybe it should be half-less-two-segments, just to provide some
> cushion against edge cases. Reading the first comment in
> SetTransactionIdLimit makes one not want to trust too much in
> arguments based on the exact value of xidWrapLimit, while for
> the other SLRUs it was already unclear whether the edge cases
> were exactly right.

That could be interesting insurance. While it would be sad for us to miss an
edge case and print "must be vacuumed within 2 transactions" when wrap has
already happened, reaching that message implies the DBA burned ~1M XIDs, all
in single-user mode. More plausible is FreezeMultiXactId() overrunning the
limit by tens of segments. Hence, if we do buy this insurance, let's skip far
more segments. For example, instead of unlinking segments representing up to
2^31 past XIDs, we could divide that into an upper half that we unlink and a
lower half. The lower half will stay in place; eventually, XID consumption
will overwrite it. Truncation behavior won't change until the region of CLOG
for pre-oldestXact XIDs exceeds 256 MiB. Beyond that threshold,
vac_truncate_clog() will unlink the upper 256 MiB and leave the rest. CLOG
maximum would rise from 512 MiB to 768 MiB. Would that be worthwhile?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-03-30 05:56:11 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Fabien COELHO 2020-03-30 05:16:17 Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction