Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "pgsql-committers(at)postgresql(dot)org" <pgsql-committers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date: 2016-05-31 15:03:12
Message-ID: CA+TgmoavJO6=xd7zj89jbSRoYbV+EchPuHkWNG3H4vMzX1g5QA@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Fri, May 27, 2016 at 10:58 AM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
>> As far as I can see normal index builds will allow concurrent hot
>> prunes and everything; since those only require page-level
>> exclusive locks.
>>
>> So for !concurrent builds we might end up with a corrupt index.
>
> ... by which you mean an index that omits certainly-dead heap
> tuples which have been the subject of early pruning or vacuum, even
> if there are still registered snapshots that include those tuples?
> Or do you see something else?

I think that is the danger.

> Again, since both the heap pages involved and all the new index
> pages would have LSNs recent enough to trigger the old snapshot
> check, I'm not sure where the problem is, but will review closely
> to see what I might have missed.

This is a good point, though, I think.
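
For reference, the check being relied on here is roughly the following
(paraphrasing the TestForOldSnapshot() macro from memory, not quoting the
source verbatim; the real thing hands off to TestForOldSnapshot_impl(),
which also consults the threshold timestamp before throwing the error):

    /* rough paraphrase of TestForOldSnapshot(), not the exact source */
    if (old_snapshot_threshold >= 0 &&
        snapshot != NULL &&
        snapshot->satisfies == HeapTupleSatisfiesMVCC &&
        !XLogRecPtrIsInvalid(snapshot->lsn) &&
        PageGetLSN(page) > snapshot->lsn)
        TestForOldSnapshot_impl(snapshot, relation);  /* "snapshot too old" */

Note the HeapTupleSatisfiesMVCC condition, which is exactly why scans
using !mvcc snapshots don't get this protection.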

>> I think many of the places relying on heap scans with !mvcc
>> snapshots are problematic in that way. Outdated reads will not be
>> detected by TestForOldSnapshot() (given the (snapshot)->satisfies
>> == HeapTupleSatisfiesMVCC condition therein), and rely on the
>> snapshot to be actually working. E.g. CLUSTER/ VACUUM FULL rely
>> on accurate HEAPTUPLE_RECENTLY_DEAD
>
> Don't the "RECENTLY" values imply that the transaction is still
> running which caused the tuple to be dead? Since tuples are not
> subject to early pruning or vacuum when that is the case, where do
> you see a problem? The snapshot itself has the usual xmin et al.,
> so I'm not sure what you might mean by "the snapshot to be actually
> working" if not the override at the time of pruning/vacuuming.

Anybody who calls HeapTupleSatisfiesVacuum() with an xmin value that
is newer than the oldest registered snapshot in the system (based on
some snapshots being ignored) might get a return value of
HEAPTUPLE_DEAD rather than HEAPTUPLE_RECENTLY_DEAD. It seems
necessary to carefully audit all calls to HeapTupleSatisfiesVacuum()
to see whether that difference matters. I took a quick look and
here's what I see:
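
(To make the hazard concrete before going through the list: the
classification at the tail of HeapTupleSatisfiesVacuum() is roughly the
following, paraphrased rather than quoted.  "OldestXmin" is whatever
horizon the caller passes in, which TransactionIdLimitedForOldSnapshots()
may have pushed forward past still-registered snapshots.)

    /* rough paraphrase of the end of HeapTupleSatisfiesVacuum() */
    if (!TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin))
        return HEAPTUPLE_RECENTLY_DEAD;   /* deleter not yet past the horizon */
    return HEAPTUPLE_DEAD;                /* judged safe to remove */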

statapprox_heap(): Statistical information for the DBA. The
difference is non-critical.

heap_prune_chain(): Seeing the tuple as dead might cause it to be
removed early. This should be OK. Removing the tuple early will
cause the page LSN to be bumped unless RelationNeedsWAL() returns
false, and TransactionIdLimitedForOldSnapshots() includes that as a
condition for disabling early pruning.
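
That is, the guard in TransactionIdLimitedForOldSnapshots() has roughly
this shape (paraphrased; I'm only highlighting the RelationNeedsWAL()
part, other applicability conditions elided):

    /*
     * Early pruning is only enabled for WAL-logged relations, so a
     * premature removal is guaranteed to bump the page LSN that
     * TestForOldSnapshot() later compares against.
     */
    if (old_snapshot_threshold >= 0 && RelationNeedsWAL(relation))
    {
        /* ... possibly return an xmin advanced up to the threshold ... */
    }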

IndexBuildHeapRangeScan(): We might end up with indexIt = false
instead of indexIt = true. That should be OK because anyone using the
old snapshot will see a new page LSN and error out. We might also
fail to set indexInfo->ii_BrokenHotChain = true. I suspect that's a
problem, but I'm not certain of it.
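
The case in question looks roughly like this (paraphrasing the
HEAPTUPLE_RECENTLY_DEAD arm of the switch in IndexBuildHeapRangeScan(),
not quoting it):

    case HEAPTUPLE_RECENTLY_DEAD:
        if (HeapTupleIsHotUpdated(heapTuple))
        {
            indexIt = false;
            /* never reached if we saw HEAPTUPLE_DEAD instead */
            indexInfo->ii_BrokenHotChain = true;
        }
        else
            indexIt = true;
        break;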

acquire_sample_rows: Both return values are treated in the same way.
No problem.

copy_heap_data: We'll end up setting isdead = true instead of
tups_recently_dead += 1. That means that the new heap won't include
the tuple, which is OK because old snapshots can't read the new heap
without erroring out, assuming that the new heap has LSNs. The xmin
used here comes from vacuum_set_xid_limits() which goes through
TransactionIdLimitedForOldSnapshots() so this should be OK for the
same reasons as heap_prune_chain(). Another effect of seeing the
tuple as prematurely dead is that we'll call rewrite_heap_dead_tuple()
on it; rewrite_heap_dead_tuple() will presume that if this tuple is
dead, its predecessor in the ctid chain is also dead. I don't see any
obvious problem with that.
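
For context, the relevant part of copy_heap_data() is roughly (paraphrased,
other cases elided):

    switch (HeapTupleSatisfiesVacuum(tuple, OldestXmin, buf))
    {
        case HEAPTUPLE_DEAD:
            isdead = true;              /* not copied to the new heap */
            break;
        case HEAPTUPLE_RECENTLY_DEAD:
            tups_recently_dead += 1;
            /* fall through */
        case HEAPTUPLE_LIVE:
            isdead = false;             /* copied to the new heap */
            break;
        /* ... */
    }
    if (isdead)
        rewrite_heap_dead_tuple(rwstate, tuple);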

lazy_scan_heap(): Basically, the same thing as heap_prune_chain().

CheckForSerializableConflictOut: Maybe a problem? If the tuple is
dead, there's no issue, but if it's recently-dead, there might be.
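
If I recall the code correctly, the two cases diverge roughly like this
(paraphrase, other cases elided; the xmin used there is TransactionXmin,
if memory serves):

    switch (HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer))
    {
        case HEAPTUPLE_RECENTLY_DEAD:
            if (!visible)
                return;
            xid = HeapTupleHeaderGetRawXmax(tuple->t_data);
            break;                      /* go on to look for an rw-conflict */
        case HEAPTUPLE_DEAD:
            return;                     /* no conflict detection at all */
        /* ... */
    }

So a tuple reported as dead rather than recently dead skips conflict
detection entirely, which is the part that worries me.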

We might want to add comments to some of these places addressing
snapshot_too_old specifically.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
