pgsql: Fix BRIN to use SnapshotAny during summarization

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix BRIN to use SnapshotAny during summarization
Date: 2015-08-05 19:21:22
Message-ID: E1ZN4Fe-0000Nn-KB@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Fix BRIN to use SnapshotAny during summarization

For correctness of summarization results, it is critical that the
snapshot used during the summarization scan is able to see all tuples
that are live to all transactions -- including tuples inserted or
deleted by in-progress transactions. Otherwise, it would be possible
for a transaction to insert a tuple, then idle for a long time while a
concurrent transaction executes summarization of the range: this would
result in the inserted value not being considered in the summary.
Previously we were trying to use an MVCC snapshot in conjunction with
adding a "placeholder" tuple in the index: the snapshot would see all
committed tuples, and the placeholder tuple would catch insertions by
any new inserters. The hole is that prior insertions by transactions
that were still in progress when the MVCC snapshot was taken were
ignored.
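
The hole described above can be illustrated with a toy model. This is
only a sketch, not PostgreSQL code: HeapTuple, visible_mvcc, visible_any,
summarize and range_may_contain are hypothetical names, and the
visibility rules are reduced to "committed or not". It shows a
minmax-style summary missing a value inserted by a transaction that is
still in progress when the summary is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeapTuple:
    xmin: int   # id of the inserting transaction (toy stand-in)
    value: int  # value of the indexed column


def visible_mvcc(tup, committed_xids):
    """Toy MVCC rule: only tuples inserted by committed transactions."""
    return tup.xmin in committed_xids


def visible_any(tup, committed_xids):
    """Toy SnapshotAny rule: every tuple is visible to the scan."""
    return True


def summarize(page_range, visible, committed_xids):
    """Build a (lo, hi) summary from the tuples the scan can see."""
    values = [t.value for t in page_range if visible(t, committed_xids)]
    return (min(values), max(values)) if values else None


def range_may_contain(summary, value):
    """A later index scan skips the range when this returns False."""
    return summary is not None and summary[0] <= value <= summary[1]


# Transaction 1 has committed. Transaction 2 inserted 500 and then went
# idle, so it is still in progress while summarization runs.
heap = [HeapTuple(1, 10), HeapTuple(1, 20), HeapTuple(2, 500)]
committed = {1}

mvcc_summary = summarize(heap, visible_mvcc, committed)
any_summary = summarize(heap, visible_any, committed)

print(mvcc_summary)  # (10, 20)
print(any_summary)   # (10, 500)
```

Once transaction 2 commits, a query for 500 that consults the
MVCC-built summary would skip the range, because
range_may_contain(mvcc_summary, 500) is False; the summary built with
the see-everything rule still covers it.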

Kevin Grittner reported this as a bogus error message during vacuum with
default transaction isolation mode set to repeatable read (because the
error report mentioned a function name not being invoked during vacuum),
but the problem is larger than that.

To fix, tweak IndexBuildHeapRangeScan to have a new mode that behaves
the way we need using SnapshotAny visibility rules. This change
simplifies the BRIN code a bit, mainly by removing large comments that
were mistaken. Instead, rely on the SnapshotAny semantics to provide
what it needs. (The business about a placeholder tuple needs to remain:
that covers the case that a transaction inserts a tuple in a page that
summarization already scanned.)
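
Why the placeholder tuple is still needed can be sketched the same way.
Again this is a toy model under stated assumptions, not the BRIN
implementation: union, summarize_range and the inserts_during_scan
argument are hypothetical, and concurrency is simulated by running the
"inserter" between page scans. The assumption is that a concurrent
inserter widens the placeholder, and that summarization merges its scan
result with the placeholder at the end.

```python
def union(a, b):
    """Merge two (lo, hi) summaries; None stands for an empty summary."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def summarize_range(pages, inserts_during_scan):
    """Scan pages in order while simulated inserters run in between.

    inserts_during_scan maps a page number to a list of
    (target_page, value) pairs inserted right after that page was scanned.
    Returns (final_summary, scan_only_summary).
    """
    placeholder = None  # empty placeholder tuple, visible to inserters
    scanned = None      # what the heap scan itself has seen so far
    for page_no, page in enumerate(pages):
        for value in page:
            scanned = union(scanned, (value, value))
        # Simulated concurrent inserter: adds the heap tuple and widens
        # the placeholder, as any new inserter would.
        for target_page, value in inserts_during_scan.get(page_no, []):
            pages[target_page].append(value)
            placeholder = union(placeholder, (value, value))
    return union(scanned, placeholder), scanned


# After page 0 has been scanned, someone inserts 500 into page 0.
pages = [[10, 20], [30]]
final, scan_only = summarize_range(pages, {0: [(0, 500)]})

print(scan_only)  # (10, 30)
print(final)      # (10, 500)
```

The scan alone never revisits page 0, so it misses 500 regardless of
which visibility rule it uses; only the merge with the placeholder
brings the value into the final summary.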

Discussion: https://www.postgresql.org/message-id/20150731175700.GX2441@postgresql.org

In passing, remove a couple of unused declarations from brin.h and
reword a comment to be proper English. This part submitted by Kevin
Grittner.

Backpatch to 9.5, where BRIN was introduced.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/2834855cb9fde734ce12f59694522c10bf0c0205

Modified Files
--------------
src/backend/access/brin/brin.c         | 41 +++++++----------------------
src/backend/catalog/index.c            | 36 +++++++++++++++++++++++++-
src/include/access/brin.h              |  2 --
src/include/catalog/index.h            |  1 +
src/test/isolation/expected/brin-1.out | 39 ++++++++++++++++++++++++++++
src/test/isolation/isolation_schedule  |  1 +
src/test/isolation/specs/brin-1.spec   | 44 ++++++++++++++++++++++++++++++++
7 files changed, 129 insertions(+), 35 deletions(-)
