Re: sample scans and predicate locking

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: sample scans and predicate locking
Date: 2019-05-19 22:22:59
Message-ID: 20190519222259.2gdlswllvf6z45rr@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-05-19 13:57:42 +1200, Thomas Munro wrote:
> On Sun, May 19, 2019 at 8:31 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > While looking at fixing [1] on master, I noticed the following
> > codeblock:
> >
> > static HeapScanDesc
> > heap_beginscan_internal(Relation relation, Snapshot snapshot,
> > int nkeys, ScanKey key,
> > ParallelHeapScanDesc parallel_scan,
> > bool allow_strat,
> > bool allow_sync,
> > bool allow_pagemode,
> > bool is_bitmapscan,
> > bool is_samplescan,
> > bool temp_snap)
> > ...
> > /*
> > * For a seqscan in a serializable transaction, acquire a predicate lock
> > * on the entire relation. This is required not only to lock all the
> > * matching tuples, but also to conflict with new insertions into the
> > * table. In an indexscan, we take page locks on the index pages covering
> > * the range specified in the scan qual, but in a heap scan there is
> > * nothing more fine-grained to lock. A bitmap scan is a different story,
> > * there we have already scanned the index and locked the index pages
> > * covering the predicate. But in that case we still have to lock any
> > * matching heap tuples.
> > */
> > if (!is_bitmapscan)
> > PredicateLockRelation(relation, snapshot);
> >
> > As you can see this only tests for is_bitmapscan, *not* for
> > is_samplescan. Which means we afaict currently every sample scan
> > predicate locks the entire relation.
>
> Right, I just tested that. That's not wrong though, is it? It's just
> overly pessimistic.

Yea, I was mostly commenting on the fact that the comment doesn't
mention sample scans, so it looks a bit accidental.

I added a comment to master (as part of a fix, where this codepath was
entered inadvertently)

> > I think there's two possibilities here:
> >
> > 1) It's just the comment that's inaccurate, and it should really talk
> > about both seqscans and sample scans. It should not be necessary to
> > lock the whole relation, but I'm not sure the code otherwise takes
> > enough care.
> >
> > 2) We should really not predicate lock the entire relation. In which
> > case I think there might be missing PredicateLockTuple/Page calls.
>
> Yeah, we could probably predicate-lock pages in
> heapam_scan_sample_next_block() and tuples in
> heapam_scan_sample_next_tuple(), instead of doing this. Seems like a
> reasonable improvement for 13. But... hmm.... There *might* be a
> theoretical argument about TABLESAMPLE(100) behaving differently if
> done per page rather than if done at relation level, wrt new pages
> added to the end later and therefore missed. And then by logical
> extension, TABLESAMPLE of any percentage. I'm not sure.

I don't think that's actually a problem, tablesample doesn't return
invisible rows. And the equivalent issue is true just as well for index
and bitmap heap scans?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-05-19 22:36:43 Re: Statistical aggregate functions are not working with PARTIAL aggregation
Previous Message Andres Freund 2019-05-19 22:17:04 Re: Segfault on ANALYZE in SERIALIZABLE isolation