Re: snapshot too old issues, first around wraparound and then more.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: snapshot too old issues, first around wraparound and then more.
Date: 2021-06-15 21:12:27
Message-ID: CAH2-Wz=dOHuefXH_weCKjO6sq9TcZoaPJsTuKLyW7m_R5EsThg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 15, 2021 at 12:49 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > My general point here is that I would like to know whether we have a
> > finite number of reasonably localized bugs or a three-ring disaster
> > that is unrecoverable no matter what we do. Andres seems to think it
> > is the latter, and I *think* Peter Geoghegan agrees, but I think that
> > the point might be worth a little more discussion.
>
> TBH, I am not clear on that either.

I don't know for sure which it is, but that in itself isn't actually
what matters to me. The most concerning thing is that I don't really
know how to *assess* the design now. The clear presence of at least
several very severe bugs doesn't necessarily prove anything (it just
*hints* at major design problems).

To make a very clear, definitive statement on this I'd probably have
to do ~1/3 of the total required work -- that'd be my guess. If it were
easy to be quite sure here then we wouldn't still be here 12 months
later. In any case I don't think that the feature
deserves to be treated all that differently to something that was
committed much more recently, given what we know. Frankly it took me
about 5 minutes to find a very serious bug in the feature, pretty much
without giving it any thought. That is not a good sign.

> I think it's a klugy, unprincipled solution to a valid real-world
> problem. I suspect the implementation issues are not unrelated to
> the kluginess of the concept. Thus, I would really like to see us
> throw this away and find something better. I admit I have nothing
> to offer about what a better solution to the problem would look like.
> But I would really like it to not involve random-seeming query failures.

I would be very happy to see somebody take this up, because it is
important. The reality is that anybody who undertakes this task
should start with the assumption that they're starting from scratch,
at least until they learn otherwise. So ISTM that it might as well be
true that it needs a total rewrite, even if it turns out to not be
strictly true.

--
Peter Geoghegan
