Re: judging acceptable discrepancy in row count v. estimate

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: judging acceptable discrepancy in row count v. estimate
Date: 2018-10-16 20:07:04
Message-ID: 65FFDBB8-23F3-46F4-81CD-2748B271EA58@gmail.com
Lists: pgsql-general

> On Oct 16, 2018, at 1:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Rob Sargent <robjsargent(at)gmail(dot)com> writes:
>> Should reality be half again as large as the estimated row count?
>> coon=# select count(*) from sui.segment;
>> count
>> ----------
>> 49,942,837 -- my commas
>> (1 row)
>
>> coon=# vacuum (analyse, verbose) sui.probandset;
>
> Uh, what does sui.probandset have to do with sui.segment ?
>
> regards, tom lane

In fullness,

INFO: analyzing "sui.segment"
INFO: "segment": scanned 30000 of 1019242 pages, containing 1470000 live rows and 0 dead rows; 30000 rows in sample, 49942858 estimated total rows
VACUUM
Time: 321934.748 ms (05:21.935)

So, rather accurately estimated (no inserts or deletes since the bogus report). Looks like it's good to 5+ decimal places, given sufficient records.

select 49942858.0/49942837.0;
?column?
------------------------
1.00000042048071878656
(1 row)
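
Put another way, the same two counts can be turned into an absolute and a relative discrepancy. This is only a sketch of the arithmetic; the variable names are mine, and the two numbers are the count(*) result and the ANALYZE estimate quoted in this thread.

```python
# Counts quoted in this thread.
actual_rows = 49_942_837      # select count(*) from sui.segment
estimated_rows = 49_942_858   # "estimated total rows" from the ANALYZE INFO line

# Absolute and relative discrepancy between the estimate and the real count.
abs_diff = estimated_rows - actual_rows
rel_diff = abs_diff / actual_rows

print(f"off by {abs_diff} rows, relative error {rel_diff:.3e}")
```

So the estimate is off by a couple of dozen rows out of roughly fifty million, which is where the run of zeros after the decimal point in the ratio comes from.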

This table has no variable length columns. I imagine that helps.
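
For what it's worth, the figures in the INFO line above appear consistent with a plain density extrapolation: live rows per scanned page, multiplied by the total page count. The sketch below only redoes that arithmetic with the numbers from the INFO line; it is my reading of those numbers, not a statement of how ANALYZE computes the estimate in every case, and the variable names are made up for illustration.

```python
# Figures copied from the ANALYZE INFO line in this message.
pages_scanned = 30_000
pages_total = 1_019_242
live_rows_in_scanned_pages = 1_470_000

# Average live rows per scanned page (assumed to hold for the unscanned pages too).
rows_per_page = live_rows_in_scanned_pages / pages_scanned

# Scale the sampled density up to the whole table.
estimated_total_rows = round(rows_per_page * pages_total)

print(rows_per_page, estimated_total_rows)
```

With fixed-width rows and no dead rows, every full page should hold the same number of rows, so a sample of pages would give nearly the exact density. That would fit with the estimate landing so close to the real count.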
