On 2015-07-12 18:02, Tom Lane wrote:
>
> A possible way around this problem is to redefine the sampling rule so
> that it is not history-dependent but depends only on the tuple TIDs.
> For instance, one could hash the TID of a candidate tuple, xor that with
> a hash of the seed being used for the current query, and then select the
> tuple if (hash/MAXINT) < P.
>
That would work for Bernoulli sampling of physical tuples, yes. The only
thing that worries me is future extensibility to data sources that only
provide virtual tuples, which have no TID to hash.
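
A rough sketch of the quoted rule, in Python rather than PostgreSQL C and
purely for illustration. Everything named here is an assumption and not
taken from the PostgreSQL source: the hash (SHA-256 truncated to 32 bits),
the (block, offset) encoding of a TID, and the function names hash32 and
tid_selected. The hash is divided by 2^32 so the result lies in [0, 1).

```python
import hashlib
import struct

HASH_RANGE = 2 ** 32  # hash32() returns values in [0, HASH_RANGE)


def hash32(data: bytes) -> int:
    """Hypothetical 32-bit hash: the first 4 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def tid_selected(block: int, offset: int, seed: int, p: float) -> bool:
    """Decide whether the tuple at (block, offset) is in the sample.

    The decision depends only on the TID, the seed and the probability p,
    never on which tuples were examined earlier in the scan.
    Assumes block fits in 32 bits, offset in 16 bits, seed in 64 bits.
    """
    tid_hash = hash32(struct.pack(">IH", block, offset))
    seed_hash = hash32(struct.pack(">Q", seed))
    mixed = tid_hash ^ seed_hash
    # Scale to [0, 1) and compare with the sampling probability.
    return mixed / HASH_RANGE < p


def sample(tids, seed: int, p: float):
    """Return the set of TIDs selected from an iterable of (block, offset)."""
    return {tid for tid in tids if tid_selected(tid[0], tid[1], seed, p)}
```

With this shape, scanning the same TIDs in a different order, or rescanning
part of the table, gives the same sample for a given seed, which is the
property the quoted proposal is after.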
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services