Re: Potential (low likelihood) wraparound hazard in heap_abort_speculative()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Potential (low likelihood) wraparound hazard in heap_abort_speculative()
Date: 2020-03-29 22:31:45
Message-ID: 20200329223145.3otke27tmjcj5rnj@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-03-29 15:20:01 -0700, Peter Geoghegan wrote:
> On Sat, Mar 28, 2020 at 2:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Unless somebody has a better idea for how to solve this in a
> > back-paptchable way, I think it'd be best to replace RecentGlobalXmin
> > with RecentXmin. That'd be safe as far as I can see.
>
> As far as I can tell, the worst consequence of this wraparound hazard
> is that we don't opportunistically prune at some later point where we
> probably ought to. Do you agree with that assessment?

Probably, yes.

> Since pd_prune_xid is documented as "a hint field" in bufpage.h, this
> bug cannot possibly lead to queries that give wrong answers. The
> performance issue also seems like it should not have much impact,
> since we only call heap_abort_speculative() in extreme cases where
> there is a lot of contention among concurrent upserting sessions.

Well, I think it could be fairly "persistent" in being set in some
cases. PageSetPrunable() and heap_prune_record_prunable() check that a
new prune xid is newer than the current one.

That said, I still think it's unlikely to be really problematic.

> Also, as you pointed out already, RecentGlobalXmin is probably not
> going to be any different to RecentXmin.

Huh, I think they very commonly are radically different? Where did I
point that out? RecentXmin is the xmin of the last snapshot
computed. Whereas RecentGlobalXmin basically is the oldest xmin of any
backend. That's a pretty large difference? Especially with longrunning
sessions, replication, logical decoding those can be different by
hundreds of millions of xids.

Where did I point out that they're not going to be very different?

> I am in favor of fixing the issue, and backpatching all the way. I
> just want to put the issue in perspective, and have my own
> understanding of things verified.

Cool.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-03-29 22:52:43 Re: Improving connection scalability: GetSnapshotData()
Previous Message Peter Geoghegan 2020-03-29 22:20:01 Re: Potential (low likelihood) wraparound hazard in heap_abort_speculative()