Re: TABLESAMPLE patch

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: TABLESAMPLE patch
Date: 2015-04-09 08:12:20
Message-ID: CAB7nPqSSfWAxdkOGf0h+zNoAF0Uuv_=ScYjaLRWyTnk_hRxf4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 9, 2015 at 4:30 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Mon, Apr 6, 2015 at 11:02 AM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>>
>> On 06/04/15 14:30, Petr Jelinek wrote:
>>>
>>> On 06/04/15 11:02, Simon Riggs wrote:
>>>
>>>> Are we ready for a final detailed review and commit?
>>>>
>>>
>>> I plan to send v12 in the evening with some additional changes that came
>>> up from Amit's comments + some improvements to error reporting. I think
>>> it will be ready then.
>>>
>>
>> Ok so here it is.
>>
>> Changes vs v11:
>> - changed input parameter list to expr_list
>> - improved error reporting, particularly when input parameters are wrong
>> - fixed SELECT docs to show correct syntax and mention that there can be
>> more sampling methods
>> - added name of the sampling method to the explain output - I don't like
>> the code much there as it has to look into RTE but on the other hand I don't
>> want to create new scan node just so it can hold the name of the sampling
>> method for explain
>> - made views containing TABLESAMPLE clause not autoupdatable
>> - added PageIsAllVisible() check before trying to check for tuple
>> visibility
>> - some typo/white space fixes
>
>
>
> Compiler warnings:
> explain.c: In function 'ExplainNode':
> explain.c:861: warning: 'sname' may be used uninitialized in this function
>
>
> Docs spellings:
>
> "PostgreSQL distrribution" extra r.
>
> "The optional parameter REPEATABLE acceps" accepts. But I don't know that
> 'accepts' is the right word. It makes the seed value sound optional to
> REPEATABLE.
>
> "each block having same chance" should have "the" before "same".
>
> "Both of those sampling methods currently...". I think it should be "these"
> not "those", as this sentence is immediately after their introduction, not
> at a distance.
>
> "...tuple contents and decides if to return in, or zero if none" Something
> here is confusing. "return it", not "return in"?
>
> Other comments:
>
> Do we want tab completions for psql? (I will never remember how to spell
> BERNOULLI).

Yes. I think so.

> Needs a Cat version bump.

The committer who will pick up this patch will normally do it.

Patch 1 is simple enough and looks fine to me.

Regarding patch 2...
I found for now some typos:
+ <title><structname>pg_tabesample_method</structname></title>
+ <productname>PostgreSQL</productname> distrribution:

Also, I am wondering if the sampling logic based on block analysis is
actually correct, for example for now this fails and I think that we
should support it:
=# with query_select as (select generate_series(1, 10) as a) select
query_select.a from query_select tablesample system (100.0/11)
REPEATABLE (9999);
ERROR: 42P01: relation "query_select" does not exist

How does the SQL spec define exactly TABLESAMPLE? Shouldn't we get a
sample from a result set?
Thoughts?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-04-09 08:25:10 Re: How about to have relnamespace and relrole?
Previous Message Shigeru HANADA 2015-04-09 07:50:31 Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)