Re: incorrect information in documentation

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: ivanmulhin(at)gmail(dot)com, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: incorrect information in documentation
Date: 2021-08-10 03:40:20
Message-ID: CAKFQuwaV87e4sdFdZab4+zcUdza+EpveQi_98=f9wJ1+QLUALw@mail.gmail.com
Lists: pgsql-docs

On Mon, Aug 9, 2021 at 11:05 AM Bruce Momjian <bruce(at)momjian(dot)us> wrote:

>
> > selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
> > = (1 - 0) * (1 - 0) / max(10000, 10000)
> > = 0.0001
>
> Nice, can you provide a patch please?
>
>
Change the line:

selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)

to be:

selectivity = (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1, num_distinct2)

The wording already talks about "divide by max".
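
As a quick sanity check, here is a minimal Python sketch of the proposed form of the formula. The function name eq_join_selectivity is made up for illustration and is not taken from the PostgreSQL source; the inputs are the values from the quoted example. It also checks that the min(1/a, 1/b) and 1/max(a, b) spellings agree for positive distinct counts, so the change affects only presentation, not the result.

```python
import math


def eq_join_selectivity(null_frac1, null_frac2, num_distinct1, num_distinct2):
    """Equality-join selectivity using only null fractions and distinct counts.

    Illustrative helper only; it mirrors the formula as proposed in this
    message, not the planner's actual code.
    """
    return (1 - null_frac1) * (1 - null_frac2) / max(num_distinct1, num_distinct2)


# Values from the quoted example: no NULLs, 10000 distinct values on each side.
sel = eq_join_selectivity(0.0, 0.0, 10000, 10000)
print(sel)

# The old and new spellings give the same factor for positive counts.
a, b = 10000, 2500
assert math.isclose(min(1 / a, 1 / b), 1 / max(a, b))
```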

Though:

"so we use an algorithm that relies only on the number of distinct values
for both relations together with their null fractions:"

could perhaps gain a parenthetical note:

"so we use an algorithm that relies only on the number of distinct values
(the row count estimate for the whole table, not the -1 in the column
statistics) for both relations together with their null fractions:"
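
To illustrate what that parenthetical is getting at, here is a small Python sketch. It assumes the pg_stats convention that a negative n_distinct is the negated ratio of distinct values to rows (so -1 means every row is distinct). The helper name effective_num_distinct and the sample row counts are hypothetical, chosen only for illustration.

```python
def effective_num_distinct(n_distinct, row_estimate):
    """Turn a stored n_distinct value into an absolute distinct-value count.

    Assumption: negative values are a negated fraction of the row count,
    positive values are already absolute counts.
    """
    if n_distinct < 0:
        return -n_distinct * row_estimate
    return n_distinct


# A unique column (n_distinct = -1) on a table estimated at 10000 rows.
print(effective_num_distinct(-1, 10000))

# A column with an absolute count stored is passed through unchanged.
print(effective_num_distinct(500, 10000))
```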

Note that I haven't tried to absorb that whole page, let alone the
implementation, and am not all that familiar with this part of PostgreSQL.
It seems right, though, in isolation.

David J.
