Re: statistic target and sample rate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Luca Ferrari <fluca1978(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: statistic target and sample rate
Date: 2021-07-14 14:30:29
Message-ID: 3484552.1626273029@sss.pgh.pa.us
Lists: pgsql-general

Luca Ferrari <fluca1978(at)gmail(dot)com> writes:
> Therefore my question is about how the statistics collector decides
> on the number of tuples to be sampled.

It's basically 300 times the largest statistics target among the
columns being analyzed:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=0c9591415e4b97dd5c5e693af1860294284a1575;hb=HEAD#l1919

Per that comment, there is good math backing this choice for the task
of making a histogram. It's a little shakier for other sorts of
statistics --- notably, for n_distinct estimation, the error can still
be really bad.
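
To make the arithmetic concrete, here is a rough sketch (not the
analyze.c code). As I read the linked comment, it cites the bound from
Chaudhuri, Motwani and Narasayya (SIGMOD 1998): for table size n,
histogram size k, maximum relative bin-size error f, and error
probability gamma, the minimum sample size is
r = 4 * k * ln(2n/gamma) / f^2. The function names below are made up
for illustration, and the targets passed in are assumed to be the
effective per-column statistics targets (100 unless changed).

```python
import math


def min_sample_size(n, k, f, gamma):
    """Sample-size bound r = 4 * k * ln(2n/gamma) / f^2 (hypothetical helper)."""
    return 4.0 * k * math.log(2.0 * n / gamma) / (f * f)


def max_bin_error(n, k, r, gamma):
    """Invert the same bound: relative bin-size error f reached by r rows."""
    return math.sqrt(4.0 * k * math.log(2.0 * n / gamma) / r)


def analyze_sample_rows(stats_targets):
    """Rows sampled under the '300 times the largest target' rule.

    stats_targets: effective statistics targets of the analyzed columns
    (an assumption of this sketch; the real code resolves defaults itself).
    """
    return 300 * max(stats_targets)


if __name__ == "__main__":
    # f = 0.5, gamma = 0.01, n = 10^6 gives roughly 305.82 rows per bucket,
    # which is where the round figure of 300 comes from.
    print(round(min_sample_size(10**6, 1, 0.5, 0.01), 2))

    # Dependence on n is only logarithmic: at n = 10^12 a 300*k sample
    # still keeps the bin-size error near 0.66 with probability 0.99.
    print(round(max_bin_error(10**12, 1, 300, 0.01), 2))

    # Default target of 100 on every column: 30000 sampled rows.
    print(analyze_sample_rows([100, 100, 100]))

    # One column raised to 500 sets the sample size for the whole table.
    print(analyze_sample_rows([100, 500, 10]))
```

Note that the table size never enters analyze_sample_rows: the sample
is a fixed row count, so the sampled fraction shrinks as the table
grows. That is harmless for histograms, per the bound above, but it
says nothing about n_distinct.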

regards, tom lane
