Re: ANALYZE sampling is too good

From: Jim Nasby <jim(at)nasby(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Greg Stark <stark(at)mit(dot)edu>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2013-12-10 23:26:56
Message-ID: 52A7A340.9070801@nasby.net
Lists: pgsql-hackers

On 12/10/13 2:17 PM, Peter Geoghegan wrote:
> On Tue, Dec 10, 2013 at 11:59 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> But I don't really think this is the right way to go about this.
>> Research papers are going to turn up pretty specialized solutions that
>> are probably patented. We don't even have the basic understanding we
>> need. I suspect a basic textbook chapter on multistage sampling will
>> discuss at least the standard techniques.
>
> I agree that looking for information on block level sampling
> specifically, and its impact on estimation quality is likely to not
> turn up very much, and whatever it does turn up will have patent
> issues.

We have an entire analytics department at work that specializes in finding patterns in our data. I might be able to get some of their time to provide at least some guidance here, if the community is interested. They could really only serve in a consulting role, though.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
