Re: fixing old_snapshot_threshold's time->xid mapping

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing old_snapshot_threshold's time->xid mapping
Date: 2020-04-18 06:16:48
Message-ID: CA+hUKG+FkUuDv-bcBns=Z_O-V9QGW0nWZNHOkEPxHZWjegRXvw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 17, 2020 at 2:12 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> What about a contrib function that lets you clobber
> oldSnapshotControl->current_timestamp? It looks like all times in
> this system come ultimately from GetSnapshotCurrentTimestamp(), which
> uses that variable to make sure that time never goes backwards.
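
To make the idea in the quoted text concrete, here is a minimal sketch in C of a timestamp source that never goes backwards, plus a test hook that overwrites the stored value. The struct and function names (ToySnapshotControl, toy_get_snapshot_timestamp, toy_clobber_timestamp) are hypothetical stand-ins, not the code in the attached patches, and locking is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared state; not PostgreSQL's real struct. */
typedef struct ToySnapshotControl
{
    int64_t current_timestamp;  /* latest timestamp handed out (microseconds) */
} ToySnapshotControl;

/*
 * Return a timestamp that never goes backwards: if the wall clock reads
 * no later than the stored value, hand out the stored value again.
 * Locking is omitted in this sketch.
 */
int64_t
toy_get_snapshot_timestamp(ToySnapshotControl *ctl, int64_t wall_clock_now)
{
    assert(ctl != NULL);

    if (wall_clock_now <= ctl->current_timestamp)
        return ctl->current_timestamp;

    ctl->current_timestamp = wall_clock_now;
    return wall_clock_now;
}

/*
 * Test hook: overwrite the stored timestamp.  Because of the clamp above,
 * every later read returns at least fake_now, so a test can pretend that
 * time has jumped forward without touching the system clock.
 */
void
toy_clobber_timestamp(ToySnapshotControl *ctl, int64_t fake_now)
{
    assert(ctl != NULL);
    ctl->current_timestamp = fake_now;
}
```

In this sketch, clobbering the value to a point in the future is enough to make every later caller see that future time, which is the property a test would rely on.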

Here's a draft TAP test that uses that technique successfully, as a
POC. It should probably be extended to cover more cases, but I
wanted to check what people think of the concept before going
further. I didn't see a way to do overlapping transactions with
PostgresNode.pm, so I invented one (please excuse the bad Perl); am I
missing something? Maybe it'd be better to do 002 with an isolation
test instead, but I suppose 001 can't be in an isolation test, since
it needs to connect to multiple databases, and it seemed better to do
them both the same way. It's also not entirely clear to me that
isolation tests can expect a database to be fresh and then mess with
dangerous internal state, whereas TAP tests set up and tear down a
cluster each time.

I think I found another bug in MaintainOldSnapshotTimeMapping(): if
you make time jump by more than old_snapshot_threshold in one go, then
the map gets cleared and then no early pruning or snapshot-too-old
errors happen. That's why in 002_too_old.pl it currently advances
time by 10 minutes twice, instead of 20 minutes once. To be
continued.
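
As a rough illustration of the behaviour described above, here is a toy model in C of a per-minute time->xid map. It is not the code in MaintainOldSnapshotTimeMapping(): the names (ToyTimeMap, toy_map_*), the capacity of 20 buckets, and the exact reset condition are all assumptions chosen only to mirror the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAP_ENTRIES 20      /* illustrative capacity: one bucket per minute */

/* Hypothetical map covering minutes head_minute .. head_minute+count_used-1. */
typedef struct ToyTimeMap
{
    int64_t  head_minute;                       /* minute of the oldest bucket */
    int      count_used;                        /* number of valid buckets */
    uint32_t xid_by_minute[TOY_MAP_ENTRIES];    /* xmin recorded per minute */
} ToyTimeMap;

/* Start (or restart) the map with a single bucket. */
void
toy_map_init(ToyTimeMap *map, int64_t minute, uint32_t xmin)
{
    map->head_minute = minute;
    map->count_used = 1;
    map->xid_by_minute[0] = xmin;
}

/* Record xmin for "minute", filling any skipped minutes with the same xmin. */
void
toy_map_advance(ToyTimeMap *map, int64_t minute, uint32_t xmin)
{
    int64_t newest = map->head_minute + map->count_used - 1;

    if (minute <= newest)
        return;                 /* bucket already exists; nothing to add */

    if (minute - newest >= TOY_MAP_ENTRIES)
    {
        /* Jump is too large: all history is discarded (assumed behaviour). */
        toy_map_init(map, minute, xmin);
        return;
    }

    for (int64_t m = newest + 1; m <= minute; m++)
    {
        if (map->count_used == TOY_MAP_ENTRIES)
        {
            /* Map is full: drop the oldest bucket to make room. */
            memmove(&map->xid_by_minute[0], &map->xid_by_minute[1],
                    (TOY_MAP_ENTRIES - 1) * sizeof(uint32_t));
            map->head_minute++;
            map->count_used--;
        }
        map->xid_by_minute[map->count_used++] = xmin;
    }
}

/* Look up the xid recorded for "minute"; false if the map does not cover it. */
bool
toy_map_lookup(const ToyTimeMap *map, int64_t minute, uint32_t *xid)
{
    int64_t newest = map->head_minute + map->count_used - 1;

    if (minute < map->head_minute || minute > newest)
        return false;
    *xid = map->xid_by_minute[minute - map->head_minute];
    return true;
}
```

In this model, starting at minute 0 and advancing straight to minute 20 resets the map to a single bucket, so a lookup for minute 10 (now minus a 10-minute threshold) finds nothing and no limit could be applied. Advancing to minute 10 and then to minute 20 keeps the history, and the same lookup succeeds.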

Attachment                                                        Content-Type  Size
v2-0001-Expose-oldSnapshotControl.patch                           text/x-patch  6.7 KB
v2-0002-contrib-old_snapshot-time-xid-mapping.patch               text/x-patch  7.9 KB
v2-0003-Fix-bugs-in-MaintainOldSnapshotTimeMapping.patch          text/x-patch  2.6 KB
v2-0004-Add-pg_clobber_current_snapshot_timestamp.patch           text/x-patch  1.9 KB
v2-0005-Truncate-old-snapshot-XIDs-before-truncating-CLOG.patch   text/x-patch  7.1 KB
v2-0006-Add-TAP-test-for-snapshot-too-old.patch                   text/x-patch  5.7 KB
