Re: Gsoc2012 idea, tablesample

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Qi Huang <huangqiyx(at)hotmail(dot)com>
Cc: heikki(dot)linnakangas(at)enterprisedb(dot)com, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de, alvherre(at)commandprompt(dot)com, neil(dot)conway(at)gmail(dot)com, daniel(at)heroku(dot)com, cbbrowne(at)gmail(dot)com, kevin(dot)grittner(at)wicourts(dot)gov
Subject: Re: Gsoc2012 idea, tablesample
Date: 2012-04-17 15:27:16
Lists: pgsql-hackers

* Qi Huang (huangqiyx(at)hotmail(dot)com) wrote:
> > Doing it 'right' certainly isn't going to be simply taking what Neil did
> > and updating it, and I understand Tom's concerns about having this be
> > more than a hack on seqscan, so I'm a bit nervous that this would turn
> > into something bigger than a GSoC project.
> As Christopher Browne mentioned, this sampling method is not possible without scanning the whole data set. It improves the sampling quality but increases the sampling cost. I think it should be used only for some special sampling types, not for general use. The general sampling methods, as in the SQL standard, should be only SYSTEM and BERNOULLI.

I'm not sure what sampling method you're referring to here.  I agree
that we need to be looking at implementing the specific sampling methods
listed in the SQL standard.  How much information is provided in the
standard about the requirements placed on these sampling methods?  Does
the SQL standard only define SYSTEM and BERNOULLI?  What do the other
databases support?  What does SQL say the requirements are for 'SYSTEM'?


