Re: Row estimates for empty tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, postgre <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Row estimates for empty tables
Date: 2020-07-24 13:48:36
Message-ID: 373968.1595598516@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> On Fri, 24 Jul 2020 at 16:01, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
>> I realize I've never quite known this; where does the planner get the row estimates for an empty table? Example:

> We just assume there are 10 pages if the relation has not yet been
> vacuumed or analyzed. The row estimates you see are the number of
> times 1 tuple is likely to fit onto a single page multiplied by the
> assumed 10 pages. If you had made your table wider then the planner
> would have assumed fewer rows

Yeah. Also note that since we have no ANALYZE stats in this scenario,
the row width estimate is going to be backed into via some guesses
based on column data types. (It's fine for fixed-width types, much
less fine for var-width.)

There's certainly not a lot besides tradition to justify the exact
numbers used in this case. However, we do have a good deal of
practical experience to justify the principle of "never assume a
table is empty, or even contains just one row, unless you're really
sure of that". Otherwise you tend to end up with nestloop joins that
will perform horrifically if you were wrong. The other join types
are notably less brittle.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Christophe Pettus 2020-07-24 14:38:08 Re: Row estimates for empty tables
Previous Message David Rowley 2020-07-24 04:56:49 Re: Row estimates for empty tables

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-07-24 13:53:50 Re: INSERT INTO SELECT, Why Parallelism is not selected?
Previous Message Dilip Kumar 2020-07-24 13:47:35 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions