Re: improving GROUP BY estimation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improving GROUP BY estimation
Date: 2016-03-31 20:57:58
Message-ID: 7833.1459457878@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On 31 March 2016 at 21:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Let's use something like this:
>> See "Approximating block accesses in database organizations", S. B. Yao,
>> Communications of the ACM, Volume 20 Issue 4, April 1977 Pages 260-261

Actually, having now looked at both the Dellera paper and the Yao one,
what the latter shows seems to be equivalent to Dellera's equation (2)
(the first one in his section 2.2). But what the code is actually
using is the approximation in the second equation in 2.2, and that
one is certainly not in Yao, although it's not hard to see how you
get from the first to the second if you assume i << Nr.

I think it'd be worth writing out equation (2), attributing it to the Yao
paper, and then saying something like "Alberto Dellera points out that if
Nd is large, so that all the values of i are much less than Nr, then this
formula can be approximated by [the formula used in the code]. It turns
out that that formula also works well even when Nd is small."

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-03-31 20:59:41 Re: improving GROUP BY estimation
Previous Message Robbie Harwood 2016-03-31 20:54:57 Re: [PATCH v9] GSSAPI encryption support