Re: getting rid of SnapshotNow

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting rid of SnapshotNow
Date: 2013-07-23 15:25:27
Message-ID: CA+TgmoZ_q2KMkxZAoRxRHB7k1tOmjVjQgYt2JuA7=U7QZoLxNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-odbc

On Thu, Jul 18, 2013 at 8:46 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> There seems to be a consensus that we should try to get rid of
> SnapshotNow entirely now that we have MVCC catalog scans, so I'm
> attaching two patches that together come close to achieving that goal:
>
> 1. snapshot-error-v1.patch introduces a new special snapshot, called
> SnapshotError. In the cases where we set SnapshotNow as a sort of
> default snapshot, this patch changes the code to use SnapshotError
> instead. This affects scan->xs_snapshot in genam.c and
> estate->es_snapshot in execUtils.c. This passes make check-world, so
> apparently there is no code in the core distribution that does this.
> However, this is safer for third-party code, which will ERROR instead
> of seg faulting. The alternative approach would be to use
> InvalidSnapshot, which I think would be OK too if people dislike this
> approach.

It seems the consensus was mildly for InvalidSnapshot, so I did it that way.

> 2. snapshot-self-not-now-v1.patch changes several uses of SnapshotNow
> to use SnapshotSelf instead. These include pgrowlocks(),
> pgstat_heap(), and get_actual_variable_range(). In all of those
> cases, only an approximately-correct answer is needed, so the change
> should be fine. I'd also generally expect that it's very unlikely for
> any of these things to get called in a context where the table being
> scanned has been updated by the current transaction after the most
> recent command-counter increment, so in practice the change in
> semantics will probably not be noticeable at all.

Tom proposed that we use SnapshotDirty for this case; let me just ask
whether there are any security concerns around that. pgstattuple only
displays aggregate information so I think that's OK, but I wonder if
the value found in get_actual_variable_range() can leak out in EXPLAIN
output or whatever. I can't particularly think of any reason why that
would actually matter, but I've generally shied away from exposing
data written by uncommitted transactions, and this would be a step in
the other direction. Does this worry anyone else or am I being
paranoid?

But thinking about it a little more, I wonder why
get_actual_variable_range() is using a snapshot at all. Presumably
what we want there is to find the last index key, regardless of the
visibility of the heap tuple to which it points. We don't really need
to consult the heap at all, one would think; the value we need ought
to be present in the index tuple. If we're going to use a snapshot
for simplicity of coding, maybe the right thing is SnapshotAny. After
all, even if the index tuples are all dead, we still have to scan
them, so it's still relevant for costing purposes.

Thoughts?

> With that done, the only remaining uses of SnapshotNow in our code
> base will be in currtid_byreloid() and currtid_byrelname(). So far no
> one on this list has been able to understand clearly what the purpose
> of those functions is, so I'm copying this email to pgsql-odbc in case
> someone there can provide more insight. If I were a betting man, I'd
> bet that they are used in contexts where the difference between
> SnapshotNow and SnapshotSelf wouldn't matter there, either. For
> example, if those functions are always invoked in a query that does
> nothing but call those functions, the difference wouldn't be visible.
> If we don't want to risk any change to the semantics, we can (1) grit
> our teeth and keep SnapshotNow around or (2) use an instantaneous MVCC
> snapshot there, and accept that people who have very large connection
> counts and extremely heavy use of those functions may see a
> performance regression.

It seems like we're leaning toward a fresh MVCC snapshot for this case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Atri Sharma 2013-07-23 15:32:58 Re: Bloom Filter lookup for hash joins
Previous Message Robert Haas 2013-07-23 14:56:29 Re: Design proposal: fsync absorb linear slider

Browse pgsql-odbc by date

  From Date Subject
Next Message Tom Lane 2013-07-23 16:28:34 Re: getting rid of SnapshotNow
Previous Message onur gulsevgi 2013-07-23 14:39:32 Re: Fwd: configure: error: unixODBC library "odbcinst" not found while trying to compile odbc