Re: fixing old_snapshot_threshold's time->xid mapping

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing old_snapshot_threshold's time->xid mapping
Date: 2020-04-18 09:27:35
Message-ID: CAFiTN-tCnc7DCD7vo6x58zuacN7s3x9CunTHWhGzWuwc+7vDNA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 18, 2020 at 11:47 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Fri, Apr 17, 2020 at 2:12 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > What about a contrib function that lets you clobber
> > oldSnapshotControl->current_timestamp? It looks like all times in
> > this system come ultimately from GetSnapshotCurrentTimestamp(), which
> > uses that variable to make sure that time never goes backwards.
>
> Here's a draft TAP test that uses that technique successfully, as a
> POC. It should probably be extended to cover more cases, but I
> thought I'd check what people thought of the concept first before
> going further. I didn't see a way to do overlapping transactions with
> PostgresNode.pm, so I invented one (please excuse the bad perl); am I
> missing something? Maybe it'd be better to do 002 with an isolation
> test instead, but I suppose 001 can't be in an isolation test, since
> it needs to connect to multiple databases, and it seemed better to do
> them both the same way. It's also not entirely clear to me that
> isolation tests can expect a database to be fresh and then mess with
> dangerous internal state, whereas TAP tests set up and tear down a
> cluster each time.
>
> I think I found another bug in MaintainOldSnapshotTimeMapping(): if
> you make time jump by more than old_snapshot_threshold in one go, then
> the map gets cleared and then no early pruning or snapshot-too-old
> errors happen. That's why in 002_too_old.pl it currently advances
> time by 10 minutes twice, instead of 20 minutes once. To be
> continued.

IMHO that doesn't seem to be a problem, because even if we jump by more
than old_snapshot_threshold in one go, we don't clear the complete map,
right? The latest snapshot timestamp will become the head_timestamp.
So in TransactionIdLimitedForOldSnapshots(), if (current_ts -
old_snapshot_threshold) is still >= head_timestamp, then we can still do
early pruning.
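
To make the argument concrete, here is a rough toy model of the idea. It is
not the code in snapmgr.c, only a sketch under assumptions: ToyMap,
toy_map_maintain(), toy_map_limit() and the MAP_ENTRIES value are all
hypothetical names and numbers, timestamps are whole minutes, and xids are
compared with a plain "<" (no wraparound handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAP_ENTRIES 20          /* hypothetical capacity, one bucket per minute */

typedef struct ToyMap
{
    int64_t  head_minute;       /* minute covered by the oldest bucket */
    int      head_offset;       /* ring-buffer index of the oldest bucket */
    int      count_used;        /* number of buckets currently in use */
    uint32_t xid_by_minute[MAP_ENTRIES];
} ToyMap;

static void
toy_map_init(ToyMap *m)
{
    m->head_minute = 0;
    m->head_offset = 0;
    m->count_used = 0;
}

/* Start over with one entry: the latest timestamp becomes the head. */
static void
toy_map_reset(ToyMap *m, int64_t ts, uint32_t xmin)
{
    m->head_minute = ts;
    m->head_offset = 0;
    m->count_used = 1;
    m->xid_by_minute[0] = xmin;
}

/* Record that xmin was observed at minute ts. */
static void
toy_map_maintain(ToyMap *m, int64_t ts, uint32_t xmin)
{
    int64_t last, new_buckets;

    if (m->count_used == 0)
    {
        toy_map_reset(m, ts, xmin);
        return;
    }
    if (ts < m->head_minute)
        return;                 /* older than the map; ignored in this toy */

    last = m->head_minute + m->count_used - 1;
    if (ts <= last)
    {
        /* Existing bucket: only ever move its xid forward. */
        int bucket = (m->head_offset + (int) (ts - m->head_minute)) % MAP_ENTRIES;

        if (m->xid_by_minute[bucket] < xmin)
            m->xid_by_minute[bucket] = xmin;
        return;
    }

    new_buckets = ts - last;
    if (new_buckets >= MAP_ENTRIES)
    {
        /* Jump is so large that every old bucket would be evicted. */
        toy_map_reset(m, ts, xmin);
        return;
    }
    while (new_buckets-- > 0)
    {
        if (m->count_used == MAP_ENTRIES)
        {
            /* Map is full: overwrite the oldest bucket and advance the head. */
            m->xid_by_minute[m->head_offset] = xmin;
            m->head_offset = (m->head_offset + 1) % MAP_ENTRIES;
            m->head_minute++;
        }
        else
        {
            m->xid_by_minute[(m->head_offset + m->count_used) % MAP_ENTRIES] = xmin;
            m->count_used++;
        }
    }
}

/* Return true and set *xlimit if (now - threshold_min) is covered by the map. */
static bool
toy_map_limit(const ToyMap *m, int64_t now, int threshold_min, uint32_t *xlimit)
{
    int64_t ts, offset;

    assert(threshold_min >= 0);
    ts = now - threshold_min;
    if (m->count_used == 0 || ts < m->head_minute)
        return false;           /* nothing old enough to prune by yet */

    offset = ts - m->head_minute;
    if (offset > m->count_used - 1)
        offset = m->count_used - 1;     /* clamp to the newest bucket */
    *xlimit = m->xid_by_minute[(m->head_offset + (int) offset) % MAP_ENTRIES];
    return true;
}
```

In this toy, with a 10-minute threshold, a jump from minute 0 to minute 50
leaves a one-entry map headed at minute 50. toy_map_limit() finds nothing
at minute 50, but at minute 60 it returns the xmin recorded at the jump, so
pruning becomes possible again once (current_ts - threshold) reaches the
new head.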

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
