Re: fixing old_snapshot_threshold's time->xid mapping

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fixing old_snapshot_threshold's time->xid mapping
Date: 2020-04-16 17:14:41
Message-ID: 20200416171441.chqp44hbrxyblz5q@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-16 12:41:55 -0400, Robert Haas wrote:
> I'm starting a new thread for this, because the recent discussion of
> problems with old_snapshot_threshold[1] touched on a lot of separate
> issues, and I think it will be too confusing if we discuss all of them
> on one thread. Attached are three patches.

Cool.

> 0003 attempts to fix bugs in MaintainOldSnapshotTimeMapping() so that
> it produces a sensible mapping. I encountered and tried to fix two
> issues here:
>
> First, as previously discussed, the branch that advances the mapping
> should not categorically do "oldSnapshotControl->head_timestamp = ts;"
> assuming that the head_timestamp is supposed to be the timestamp for
> the oldest bucket rather than the newest one. Rather, there are three
> cases: (1) resetting the mapping resets head_timestamp, (2) extending
> the mapping by an entry without dropping an entry leaves
> head_timestamp alone, and (3) overwriting the previous head with a new
> entry advances head_timestamp by 1 minute.

> Second, the calculation of the number of entries by which the mapping
> should advance is incorrect. It thinks that it should advance by the
> number of minutes between the current head_timestamp and the incoming
> timestamp. That would be correct if head_timestamp were the most
> recent entry in the mapping, but it's actually the oldest. As a
> result, without this fix, every time we move into a new minute, we
> advance the mapping much further than we actually should. Instead of
> advancing by 1, we advance by the number of entries that already exist
> in the mapping - which means we now have entries that correspond to
> times which are in the future, and don't advance the mapping again
> until those future timestamps are in the past.
>
> With these fixes, I seem to get reasonably sensible mappings, at least
> in light testing. I tried running this in one window with \watch 10:
>
> select *, age(newest_xmin), clock_timestamp() from
> pg_old_snapshot_time_mapping();
>
> And in another window I ran:
>
> pgbench -T 300 -R 10
>
> And the age does in fact advance by ~600 transactions per minute.

I still think we need a way to test this without waiting for hours to
hit various edge cases. You argued against a fixed binning of
old_snapshot_threshold/100 arguing its too coarse. How about a 1000 or
so? For 60 days, the current max for old_snapshot_threshold, that'd be a
granularity of 01:26:24, which seems fine. The best way I can think of
that'd keep current GUC values sensible is to change
old_snapshot_threshold to be float. Ugly, but ...?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-04-16 17:20:04 Re: Race condition in SyncRepGetSyncStandbysPriority
Previous Message Tom Lane 2020-04-16 17:10:37 Re: [PATCH] Incremental sort (was: PoC: Partial sort)