Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Date: 2020-03-22 08:34:52
Message-ID: 20200322083452.GB1918808@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 19, 2020 at 06:04:52PM -0400, Tom Lane wrote:
> Yeah, this patch has been waiting in the queue for way too long :-(.

Thanks for reviewing.

> I spent some time studying this, and I have a few comments/questions:
>
> 1. It seems like this discussion is conflating two different issues.
> The original complaint was about a seemingly-bogus log message "could
> not truncate directory "pg_xact": apparent wraparound". Your theory
> about that, IIUC, is that SimpleLruTruncate's initial round-back of
> cutoffPage to a segment boundary moved us from a state where
> shared->latest_page_number doesn't logically precede the cutoffPage to
> one where it does, triggering the message. So removing the initial
> round-back, and then doing whatever's needful to compensate for that in
> the later processing, is a reasonable fix to prevent the bogus warning.
> However, you're also discussing whether or not an SLRU segment file that
> is close to the wraparound boundary should get removed or not. As far

When the newest XID and the oldest XID fall in "opposite" segments in the XID
space, we must not unlink the segment containing the newest XID. That is the
chief goal at present.

> as I can see that's 100% independent of issuance of the log message, no?

Perhaps confusing is that the first message of the thread and the subject line
contain wrong claims, which I corrected in the 2019-02-13 message[1]. Due to
point (2) in [1], it's essential to make the "apparent wraparound" message a
can't-happen event. Hence, I'm not looking to improve the message or its
firing conditions. I want to fix the excess segment deletion, at which point
the message will become unreachable except under single-user mode.

> 2. I wonder whether we have an issue even with rounding back to the
> SLRU page boundary, as is done by each caller before we ever get to
> SimpleLruTruncate. I'm pretty sure that the actual anti-wraparound
> protections are all precise to the XID level, so that there's some
> daylight between what SimpleLruTruncate is told and the range of
> data that the higher-level code thinks is of interest. Maybe we
> should restructure things so that we keep the original cutoff XID
> (or equivalent) all the way through, and compare the start/end
> positions of a segment file directly to that.

Currently, slru.c knows nothing about the division of pages into records.
Hmm. To keep oldestXact all the way through, I suppose the PagePrecedes
callback could become one or more record-oriented (e.g. XID-oriented)
callbacks. The current scheme just relies on TransactionIdToPage() in
TruncateCLOG(). If TransactionIdToPage() had a bug, all sorts of CLOG lookups
would do wrong things. Hence, I think today's scheme is tougher to get wrong.
Do you see it differently?

> 3. It feels like the proposed test of cutoff position against both
> ends of a segment is a roundabout way of fixing the problem. I
> wonder whether we ought not pass *both* the cutoff and the current
> endpoint (latest_page_number) down to the truncation logic, and
> have it compare against both of those values.

Since latest_page_number can keep changing throughout SlruScanDirectory()
execution, that would give a false impression of control. Better to
demonstrate that the xidWrapLimit machinery keeps latest_page_number within
acceptable constraints than to ascribe significance to a comparison with a
stale latest_page_number.

> To try to clarify this in my head, I thought about an image of
> the modular XID space as an octagon, where each side would correspond to
> a segment file if we chose numbers such that there were only 8 possible
> segment files. Let's say that nextXID is somewhere in the bottommost
> octagon segment. The oldest possible value for the truncation cutoff
> XID is a bit less than "halfway around" from nextXID; so it could be
> in the topmost octagon segment, if the minimum permitted daylight-
> till-wraparound is less than the SLRU segment size (which it is).
> Then, if we round the cutoff XID "back" to a segment boundary, most
> of the current (bottommost) segment is now less than halfway around
> from that cutoff point, and in particular the current segment's
> starting page is exactly halfway around. Because of the way that
> TransactionIdPrecedes works, the current segment will be considered to
> precede that cutoff point (the int32 difference comes out as exactly
> 2^31 which is negative). Boom, data loss, because we'll decide the
> current segment is removable.

Exactly.
https://docs.google.com/drawings/d/1xRTbQ4DVyP5wI1Ujm_gmmY-cC8KKCjahEtsU_o0fC7I
uses your octagon to show the behaviors before and after this patch.

> I think that your proposed patch does fix this, but I'm not quite
> sold that the corner cases (where the cutoff XID is itself exactly
> at a page boundary) are right.

That's a good thing to worry about. More specifically, I think the edge case
to check is when oldestXact is the last XID of a _segment_. That case
maximizes the XIDs we can delete. At that time, xidWrapLimit should likewise
fall near the end of some opposing segment that we refuse to unlink.

> It also bothers me that some of the callers of SimpleLruTruncate
> have explicit one-count backoffs of the cutoff point and others
> do not. There's no obvious reason for the difference, so I wonder
> if that isn't something we should have across-the-board, or else
> adjust SimpleLruTruncate to do the equivalent thing internally.

Consider the case of PerformOffsetsTruncation(). If newOldestMulti is the
first of a page, then SimpleLruTruncate() gets the previous page. If that
page and the newOldestMulti page fall in different segments, could we unlink
the segment that contained newOldestMulti, or does some other off-by-one
compensate? I'm not sure. I do know that to lose mxact data this way, one
must reach multiStopLimit, restart in single user mode, and consume all the
way to the edge of multiWrapLimit. Considering the rarity of those
preconditions, I am not inclined to bundle a fix with $SUBJECT. (Even OID
reuse race conditions may present more risk than this does.) If someone does
pursue a fix here, I recommend looking at other fixes before making the other
callers subtract one. The subtraction in TruncateSUBTRANS() and
PerformOffsetsTruncation() is a hack.

[1] https://postgr.es/m/20190214072623.GA1139206@rfd.leadboat.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Maxim Ivanov 2020-03-22 09:11:45 optimisation? collation "C" sorting for GroupAggregate for all deterministic collations
Previous Message Pavel Stehule 2020-03-22 07:40:33 Re: proposal: schema variables