Re: Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug? Small samples in TABLESAMPLE SYSTEM returns zero rows
Date: 2015-08-06 19:45:02
Message-ID: 21502.1438890302@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On 6 August 2015 at 20:14, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Speaking from a user perspective, SYSTEM seems broken to me. I can't
>> imagine using it for anything with that degree of variation in the
>> number of results returned, especially if it's possible to return zero
>> rows from a populated table.

> Please bear in mind you have requested a very small random sample of blocks.

Indeed. My expectation about it is that you'd get the requested number of
rows *on average* over many tries (which is pretty much what Josh's
results show). Since what SYSTEM actually returns must be a multiple of
the number of rows per page, if you make a request that's less than that
number of rows, you must get zero rows some of the time. Otherwise the
sampling logic is cheating.
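
To illustrate the arithmetic, here is a minimal sketch in Python. It assumes a
simplified model in which each page is chosen independently with probability
equal to the sampling fraction and every page holds the same number of rows.
The function names and the table dimensions are made up for illustration; this
is not the actual sampling code.

```python
import random


def system_sample_row_count(n_pages, rows_per_page, fraction, rng):
    """Simplified model of block-level sampling (an assumption, not the
    real implementation): keep each page independently with probability
    `fraction`; a kept page contributes all of its rows."""
    pages_kept = sum(1 for _ in range(n_pages) if rng.random() < fraction)
    return pages_kept * rows_per_page


def simulate(n_pages=100, rows_per_page=100, fraction=0.001,
             trials=20000, seed=1):
    """Repeat the sample many times and summarize the row counts.

    With the defaults the table has 10000 rows and the request is for
    0.1% of them, i.e. 10 rows, which is less than one page's worth."""
    rng = random.Random(seed)
    counts = [
        system_sample_row_count(n_pages, rows_per_page, fraction, rng)
        for _ in range(trials)
    ]
    mean_rows = sum(counts) / trials          # average rows per try
    zero_share = counts.count(0) / trials     # fraction of tries with no rows
    return counts, mean_rows, zero_share


if __name__ == "__main__":
    counts, mean_rows, zero_share = simulate()
    print("distinct row counts seen:", sorted(set(counts)))
    print("mean rows per try:", round(mean_rows, 2))
    print("share of tries returning zero rows:", round(zero_share, 3))
```

Under these assumptions every result is a multiple of the rows-per-page figure,
most tries return nothing, and the average still comes out close to the
requested 10 rows, because the occasional try that does pick a page returns a
whole page's worth.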

I do *not* think that we should force the sample to contain at least one
page, which is the only way that we could satisfy the complaint as stated.

Perhaps we need to adjust the documentation to make it clearer that
block-level sampling is not the thing to use if you want a sample that
doesn't amount to a reasonable number of blocks. But I see absolutely
no evidence here that the sampling isn't behaving exactly as expected.

regards, tom lane
