Re: TABLESAMPLE patch

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: TABLESAMPLE patch
Date: 2015-04-12 13:30:45
Message-ID: CA+U5nMLj-ZghN4i=ZEEZTKdf6mmOYTMd9KtnXXbt+CEoOBJM2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10 April 2015 at 15:26, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> What is your intended use case for this feature?

Likely use cases are:
* Limits on numbers of rows in sample. Some research colleagues have
published a new mathematical analysis that will allow a lower limit
than previously considered.
* Time limits on sampling. This allows data visualisation approaches
to gain approximate answers in real time.
* Stratified sampling. Anything with some kind of filtering, lifting
or bias. Allows filtering out known incomplete data.
* Limits on sample error

Later use cases would allow custom aggregates to work together with
custom sampling methods, so we might work our way towards i) an SUM()
function that provides the right answer even when used with a sample
scan, ii) custom aggregates that report the sample error, allowing you
to get both AVG() and AVG_STDERR(). That would be technically possible
with what we have here, but I think a lot more thought required yet.

These have all come out of detailed discussions with two different
groups of data mining researchers.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2015-04-12 17:12:50 Re: SSL information view
Previous Message 彭瑞华 2015-04-12 11:52:33 Re: WIP Patch for GROUPING SETS phase 1