Re: snapshot too old issues, first around wraparound and then more.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: snapshot too old issues, first around wraparound and then more.
Date: 2020-04-03 00:17:13
Message-ID: 20200403001713.3ukjhhqwscsgf2rl@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-01 12:02:18 -0400, Robert Haas wrote:
> I have no objection to the idea that *if* the feature is hopelessly
> broken, it should be removed.

I don't think we have a real choice here at this point, at least for the
back branches.

Just about nothing around old_snapshot_threshold works correctly:

* There are basically no tests (see [1] I jsut sent, and also
old_snapshot_threshold bypassing a lot of the relevant code).

* We don't detect errors after hot pruning (to allow that is a major
point of the feature) when access is via any sort of index
scans. Wrong query results.

* The time->xid mapping is is entirely broken. We don't prevent bloat
for many multiples of old_snapshot_threshold (if above 1min).

It's possible, but harder, to have this cause wrong query results.

* In read-mostly workloads it can trigger errors in sessions that are
much younger than old_snapshot_threshold, if the transactionid is not
advancing.

I've not tried to reproduce, but I suspect this can also cause wrong
query results. Because a later snapshot can have the same xmin as
older transactions, it sure looks like we can end up with data from an
older xmin getting removed, but the newer snapshot's whenTaken will
prevent TestForOldSnapshot_impl() from raising an error.

* I am fairly sure that it can cause crashes (or even data corruption),
because it assumes that DML never needs to check for old snapshots
(with no meaningful justificiation). Leading to heap_update/delete to
assume the page header is a tuple.

* There's obviously also the wraparound issue that made me start this
thread initially.

Since this is a feature that can result in wrong query results (and
quite possibly crashes / data corruption), I don't think we can just
leave this unfixed. But given the amount of code / infrastructure
changes required to get this into a working feature, I don't see how we
can unleash those changes onto the stable branches.

There's quite a few issues in here that require not just local bugfixes,
but some design changes too. And it's pretty clear that the feature
didn't go through enough review before getting committed. I see quite
some merit in removing the code in master, and having a potential
reimplementation go through a normal feature integration process.

I don't really know what to do here. Causing problems by neutering a
feature in the back branch *sucks*. While not quite as bad, removing a
feature without a replacement in a major release is pretty harsh
too. But I don't really see any other realistic path forward.

FWIW, I've now worked around the interdependency of s_t_o my snapshot
scalability patch (only took like 10 days). I have manually confirmed it
works with 0/1 minute thresholds. I can make the tests pass unmodified
if I just add SetOldSnapshotThresholdTimestamp() calls when not
necessary (which obviously makes no sense). Lead to some decent
improvements around pruning that are independent of s_t_o (with more
possibilities "opened"). But I still think we need to do something
here.

Greetings,

Andres Freund

[1] https://postgr.es/m/20200403001235.e6jfdll3gh2ygbuc%40alap3.anarazel.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-03 00:20:28 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Andres Freund 2020-04-03 00:12:35 Re: snapshot too old issues, first around wraparound and then more.