fixing old_snapshot_threshold's time->xid mapping

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: fixing old_snapshot_threshold's time->xid mapping
Date: 2020-04-16 16:41:55
Message-ID: CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I'm starting a new thread for this, because the recent discussion of
problems with old_snapshot_threshold[1] touched on a lot of separate
issues, and I think it will be too confusing if we discuss all of them
on one thread. Attached are three patches.

0001 makes oldSnapshotControl "extern" rather than "static" and
exposes the struct definition via a header.

0002 adds a contrib module called old_snapshot which makes it possible
to examine the time->XID mapping via SQL. As Andres said, the comments
are not really adequate in the existing code, and the code itself is
buggy, so it was a little hard to be sure that I was understanding the
intended meaning of the different fields correctly. However, I gave it
a shot.

0003 attempts to fix bugs in MaintainOldSnapshotTimeMapping() so that
it produces a sensible mapping. I encountered and tried to fix two
issues here:

First, as previously discussed, the branch that advances the mapping
should not categorically do "oldSnapshotControl->head_timestamp = ts;"
assuming that the head_timestamp is supposed to be the timestamp for
the oldest bucket rather than the newest one. Rather, there are three
cases: (1) resetting the mapping resets head_timestamp, (2) extending
the mapping by an entry without dropping an entry leaves
head_timestamp alone, and (3) overwriting the previous head with a new
entry advances head_timestamp by 1 minute.

Second, the calculation of the number of entries by which the mapping
should advance is incorrect. It thinks that it should advance by the
number of minutes between the current head_timestamp and the incoming
timestamp. That would be correct if head_timestamp were the most
recent entry in the mapping, but it's actually the oldest. As a
result, without this fix, every time we move into a new minute, we
advance the mapping much further than we actually should. Instead of
advancing by 1, we advance by the number of entries that already exist
in the mapping - which means we now have entries that correspond to
times which are in the future, and don't advance the mapping again
until those future timestamps are in the past.

With these fixes, I seem to get reasonably sensible mappings, at least
in light testing. I tried running this in one window with \watch 10:

select *, age(newest_xmin), clock_timestamp() from
pg_old_snapshot_time_mapping();

And in another window I ran:

pgbench -T 300 -R 10

And the age does in fact advance by ~600 transactions per minute.

I'm not proposing to commit anything here right now. These patches
haven't had enough testing for that, and their interaction with other
bugs in the feature needs to be considered before we do anything.
However, I thought it might be useful to put them out for review and
comment, and I also thought that having the contrib module from 0002
might permit other people to do some better testing of this feature
and these fixes.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://postgr.es/m/20200401064008.qob7bfnnbu4w5cw4@alap3.anarazel.de

Attachment Content-Type Size
v1-0001-Expose-oldSnapshotControl.patch application/octet-stream 6.7 KB
v1-0003-Fix-bugs-in-MaintainOldSnapshotTimeMapping.patch application/octet-stream 2.6 KB
v1-0002-contrib-old_snapshot-time-xid-mapping.patch application/octet-stream 7.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2020-04-16 16:42:13 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message James Coleman 2020-04-16 16:04:03 Re: sqlsmith crash incremental sort